<!DOCTYPE html>
<html  lang="zh">
<head>
    <meta charset="utf-8">
<title>Writing a Java Crawler by Hand: Part 3, the Crawl Queue - 一灰灰Blog</title>
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1" />



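    <meta name="description" content="Part 3: implementing the crawl queue. Part 2 implemented depth crawling, but it left one obvious problem: each crawl did not run as an independent task, so the links found in a page were crawled serially. This part therefore focuses on crawling pages concurrently. The goal is for each link to be crawled as an independent job.">
<meta name="keywords" content="crawler">
<meta property="og:type" content="article">
<meta property="og:title" content="Writing a Java Crawler by Hand: Part 3, the Crawl Queue">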
<meta property="og:url" content="https://blog.hhui.top/hexblog/2017/07/07/Java-动手写爬虫-三、爬取队列/index.html">
<meta property="og:site_name" content="一灰灰Blog">
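<meta property="og:description" content="Part 3: implementing the crawl queue. Part 2 implemented depth crawling, but it left one obvious problem: each crawl did not run as an independent task, so the links found in a page were crawled serially. This part therefore focuses on crawling pages concurrently. The goal is for each link to be crawled as an independent job.">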
<meta property="og:locale" content="zh-CN">
<meta property="og:image" content="https://blog.hhui.top/hexblog/images/og_image.png">
<meta property="og:updated_time" content="2018-07-25T14:55:41.000Z">
<meta name="twitter:card" content="summary">
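<meta name="twitter:title" content="Writing a Java Crawler by Hand: Part 3, the Crawl Queue">
<meta name="twitter:description" content="Part 3: implementing the crawl queue. Part 2 implemented depth crawling, but it left one obvious problem: each crawl did not run as an independent task, so the links found in a page were crawled serially. This part therefore focuses on crawling pages concurrently. The goal is for each link to be crawled as an independent job.">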
<meta name="twitter:image" content="https://blog.hhui.top/hexblog/images/og_image.png">







<link rel="icon" href="/hexblog/images/avatar.jpg">


<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/bulma@0.7.2/css/bulma.css">
<link rel="stylesheet" href="https://use.fontawesome.com/releases/v5.4.1/css/all.css">
<link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Ubuntu:400,600|Source+Code+Pro">
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/highlight.js@9.12.0/styles/docco.css">


    
    
    
    <style>body>.footer,body>.navbar,body>.section{opacity:0}</style>
    

    
    
    
    <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/lightgallery@1.6.8/dist/css/lightgallery.min.css">
    <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/justifiedGallery@3.7.0/dist/css/justifiedGallery.min.css">
    

    
    

<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/outdatedbrowser@1.1.5/outdatedbrowser/outdatedbrowser.min.css">


    
    
    
    

<link rel="stylesheet" href="/hexblog/css/back-to-top.css">


    
    

    
    
<script>
var _hmt = _hmt || [];
(function() {
    var hm = document.createElement("script");
    hm.src = "//hm.baidu.com/hm.js?028d9e53f991d9739ecc7cc42e13c500";
    var s = document.getElementsByTagName("script")[0];
    s.parentNode.insertBefore(hm, s);
})();
</script>

    
    

    
    
<link rel="stylesheet" href="/hexblog/css/progressbar.css">
<script src="https://cdn.jsdelivr.net/npm/pace-js@1.0.2/pace.min.js"></script>

    
    
    

    
    
    


<link rel="stylesheet" href="/hexblog/css/style.css">
</head>
<body class="is-3-column">
    <nav class="navbar navbar-main">
    <div class="container">
        <div class="navbar-brand is-flex-center">
            <a class="navbar-item navbar-logo" href="/hexblog/">
            
                <img src="/hexblog/images/avatar.jpg" alt="Writing a Java Crawler by Hand: Part 3, the Crawl Queue" height="28">
            
            </a>
        </div>
        <div class="navbar-menu">
            
            <div class="navbar-start">
                
                <a class="navbar-item"
                href="/hexblog/.">Home</a>
                
                <a class="navbar-item"
                href="/hexblog/archives">Archives</a>
                
                <a class="navbar-item"
                href="/hexblog/tags">Tags</a>
                
                <a class="navbar-item"
                href="http://spring.hhui.top">Spring</a>
                
                <a class="navbar-item"
                href="/hexblog/categories/Java/">Java</a>
                
                <a class="navbar-item"
                href="/hexblog/categories/Python/">Python</a>
                
                <a class="navbar-item"
                href="/hexblog/categories/DB/">DB</a>
                
                <a class="navbar-item"
                href="/hexblog/categories/Shell/">Shell</a>
                
                <a class="navbar-item"
                href="/hexblog/categories/Quick系列/">Quick系列</a>
                
                <a class="navbar-item"
                href="/hexblog/categories/前端/">Front End</a>
                
                <a class="navbar-item"
                href="/hexblog/categories/开源/">Open Source</a>
                
                <a class="navbar-item"
                href="/hexblog/categories/工具/">Tools</a>
                
                <a class="navbar-item"
                href="/hexblog/categories/随笔/">Essays</a>
                
                <a class="navbar-item"
                href="/hexblog/about">About</a>
                
            </div>
            
            <div class="navbar-end">
                
                    
                    
                    <a class="navbar-item" target="_blank" title="Download on GitHub" href="https://github.com/liuyueyi">
                        
                        <i class="fab fa-github"></i>
                        
                    </a>
                    
                
                
                
                <a class="navbar-item search" title="Search" href="javascript:;">
                    <i class="fas fa-search"></i>
                </a>
                
            </div>
        </div>
    </div>
</nav>
    
    <section class="section">
        <div class="container">
            <div class="columns">
                <div class="column is-8-tablet is-8-desktop is-7-widescreen has-order-2 column-main"><div class="card">
    
        <span >
            <div class="thumbnail default_logo">
                <br/>
                <span >
                Writing a Java Crawler by Hand: Part 3, the Crawl Queue
                <br>
                &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
                <span style="font-size:0.7em">by 一灰灰</span>
                </span>
            </div>
            </span>
    

    <div class="card-content article ">
        
        <div class="level article-meta is-size-7 is-uppercase is-mobile is-overflow-x-auto">
            <div class="level-left">
                <time class="level-item has-text-grey" datetime="2017-07-07T14:30:26.000Z">2017-07-07</time>
                
                <div class="level-item">
                <a class="has-link-black -link" href="/hexblog/categories/Quick系列/">Quick系列</a>&nbsp;/&nbsp;<a class="has-link-black -link" href="/hexblog/categories/Quick系列/QuickCrawler/">QuickCrawler</a>
                </div>
                
                
                <span class="level-item has-text-grey" style='font-size: 1.2em;'>
                    
                    
                    32 min read (about 4835 characters)
                </span>
                
                
                
                <span class="level-item has-text-grey" id="busuanzi_container_page_pv">
                    <i class="far fa-eye"></i>
                    <span id="busuanzi_value_page_pv">0</span> views
                </span>
                
            </div>
        </div>
        
        <h1 class="title is-size-3 is-size-4-mobile has-text-weight-normal">
            
                Writing a Java Crawler by Hand: Part 3, the Crawl Queue
            
        </h1>
        <div class="content">
            
                <!-- 文章详情页 -->
                <div id="toc" class="toc-article">
                <strong class="toc-title"> Table of Contents </strong>
                    <ol class="toc"><li class="toc-item toc-level-1"><a class="toc-link" href="#第三篇-爬取队列的实现"><span class="toc-text">Part 3: Implementing the Crawl Queue</span></a><ol class="toc-child"><li class="toc-item toc-level-2"><a class="toc-link" href="#设计"><span class="toc-text">Design</span></a><ol class="toc-child"><li class="toc-item toc-level-3"><a class="toc-link" href="#分工说明"><span class="toc-text">Division of Responsibilities</span></a></li><li class="toc-item toc-level-3"><a class="toc-link" href="#1-CrawlMeta"><span class="toc-text">1. CrawlMeta</span></a></li><li class="toc-item toc-level-3"><a class="toc-link" href="#2-FetchQueue"><span class="toc-text">2. FetchQueue</span></a></li><li class="toc-item toc-level-3"><a class="toc-link" href="#3-DefaultAbstractCrawlJob"><span class="toc-text">3. DefaultAbstractCrawlJob</span></a></li><li class="toc-item toc-level-3"><a class="toc-link" href="#4-Fetcher"><span class="toc-text">4. Fetcher</span></a></li><li class="toc-item toc-level-3"><a class="toc-link" href="#5-测试"><span class="toc-text">5. Testing</span></a></li></ol></li><li class="toc-item toc-level-2"><a class="toc-link" href="#改进"><span class="toc-text">Improvements</span></a><ol class="toc-child"><li class="toc-item toc-level-3"><a class="toc-link" href="#1-待改善点"><span class="toc-text">1. Points to Improve</span></a></li><li class="toc-item toc-level-3"><a class="toc-link" href="#2-线程池"><span class="toc-text">2. Thread Pool</span></a></li><li class="toc-item toc-level-3"><a class="toc-link" href="#3-ResultFilter"><span class="toc-text">3. ResultFilter</span></a><ol class="toc-child"><li class="toc-item toc-level-4"><a class="toc-link" href="#计数配置-JobCount"><span class="toc-text">Counting Configuration: JobCount</span></a></li></ol></li></ol></li><li class="toc-item toc-level-2"><a class="toc-link" href="#小结"><span class="toc-text">Summary</span></a><ol class="toc-child"><li class="toc-item toc-level-3"><a class="toc-link" href="#缺陷"><span class="toc-text">Limitations</span></a></li><li class="toc-item toc-level-3"><a class="toc-link" href="#源码地址"><span class="toc-text">Source Code</span></a></li><li class="toc-item toc-level-3"><a class="toc-link" href="#相关博文"><span class="toc-text">Related Posts</span></a></li></ol></li><li class="toc-item toc-level-2"><a class="toc-link" href="#II-其他"><span class="toc-text">II. Other</span></a><ol class="toc-child"><li class="toc-item toc-level-3"><a class="toc-link" href="#一灰灰Blog：-https-liuyueyi-github-io-hexblog"><span class="toc-text">一灰灰Blog: https://liuyueyi.github.io/hexblog</span></a></li><li class="toc-item toc-level-3"><a class="toc-link" href="#声明"><span class="toc-text">Disclaimer</span></a></li><li class="toc-item toc-level-3"><a class="toc-link" href="#扫描关注"><span class="toc-text">Scan to Follow</span></a></li></ol></li></ol></li></ol>
                </div>
            

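            <h1 id="第三篇-爬取队列的实现"><a href="#第三篇-爬取队列的实现" class="headerlink" title="Part 3: Implementing the Crawl Queue"></a>Part 3: Implementing the Crawl Queue</h1><blockquote>
<p>Part 2 implemented depth crawling, but it left one obvious problem: each crawl did not run as an independent task, so the links found in a page were crawled serially. This part therefore focuses on crawling pages concurrently.</p>
<p>The goal is for each link to be crawled as an independent job.</p>
</blockquote>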
<a id="more"></a>
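<h2 id="设计"><a href="#设计" class="headerlink" title="Design"></a>Design</h2><h3 id="分工说明"><a href="#分工说明" class="headerlink" title="Division of Responsibilities"></a>Division of Responsibilities</h3><ul>
<li>Each job is an independent crawl task and fetches only the URL assigned to it</li>
<li>A blocking queue holds all the URLs waiting to be crawled</li>
<li>A controller takes pending links off the queue and creates a new job to run each one</li>
</ul>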
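<p><img src="https://static.oschina.net/uploads/img/201707/07122341_6yHD.png" alt="crawler.png"></p>
<p>Notes on the diagram:</p>
<ul>
<li><p>Fetcher: takes a <code>CrawlMeta</code> from the queue, then creates a Job and starts it</p>
</li>
<li><p>Job: crawls the page described by the <code>CrawlMeta</code> and, once done, hands the result to the <code>ResultSelector</code></p>
</li>
<li><p>ResultSelector: analyzes the crawl result, extracts every link that meets the conditions, wraps each one in a <code>CrawlMeta</code>, and puts it on the queue</p>
</li>
</ul>
<p>Together these three pieces form a loop, and that loop is what drives the automatic depth crawl.</p>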
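<h3 id="1-CrawlMeta"><a href="#1-CrawlMeta" class="headerlink" title="1. CrawlMeta"></a>1. <code>CrawlMeta</code></h3><blockquote>
<p>The meta object holds the URL to crawl together with its selector rules and link-filter rules. It now also needs a current-depth field that records which level the URL being crawled sits at, so that we can decide when to stop crawling deeper.</p>
</blockquote>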
<figure class="highlight java hljs"><table><tr><td class="code"><pre>/**
 * Current crawl depth
 */
@Getter
@Setter
private int currentDepth = 0;</pre></td></tr></table></figure>
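<h3 id="2-FetchQueue"><a href="#2-FetchQueue" class="headerlink" title="2. FetchQueue"></a>2. <code>FetchQueue</code></h3><blockquote>
<p>This is the queue that holds the pages waiting to be crawled. It contains two data structures:</p>
<ul>
<li>toFetchQueue: a queue of <code>CrawlMeta</code> objects, each one a URL that still needs to be crawled</li>
<li>urls: the set of every URL that has been crawled or is waiting to be crawled, used for deduplication</li>
</ul>
</blockquote>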
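<p>The source is below. A few points are worth noting:</p>
<ul>
<li>tag: this field is kept because the system may end up with more than one crawl queue; if it does, the tag identifies what each queue is for</li>
<li><code>addSeed</code>: first checks whether the URL has already entered the queue and, if so, does not enqueue it again (compare this with the deduplication approach used in the previous part). Taking the first element off the queue is done without a lock, because <code>ArrayBlockingQueue</code> guarantees thread safety internally</li>
</ul>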
<figure class="highlight java hljs"><table><tr><td class="code"><pre>import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Queue of pages waiting to be crawled
 * &lt;p&gt;
 * Created by yihui on 2017/7/6.
 */
public class FetchQueue {

    public static FetchQueue DEFAULT_INSTANCE = newInstance("default");

    /**
     * Tag that identifies this crawl queue
     */
    private String tag;

    /**
     * Queue of pages waiting to be crawled
     */
    private Queue&lt;CrawlMeta&gt; toFetchQueue = new ArrayBlockingQueue&lt;&gt;(200);

    /**
     * Every URL that has been crawled or queued, used for deduplication
     */
    private Set&lt;String&gt; urls = ConcurrentHashMap.newKeySet();

    private FetchQueue(String tag) {
        this.tag = tag;
    }

    public static FetchQueue newInstance(String tag) {
        return new FetchQueue(tag);
    }

    /**
     * Enqueue only if the URL has not been seen before, to avoid crawling it twice
     *
     * @param crawlMeta the page to enqueue
     */
    public void addSeed(CrawlMeta crawlMeta) {
        // Fast path without the lock
        if (urls.contains(crawlMeta.getUrl())) {
            return;
        }

        synchronized (this) {
            // Check again under the lock before recording and enqueuing
            if (urls.contains(crawlMeta.getUrl())) {
                return;
            }

            urls.add(crawlMeta.getUrl());
            toFetchQueue.add(crawlMeta);
        }
    }

    public CrawlMeta pollSeed() {
        return toFetchQueue.poll();
    }
}</pre></td></tr></table></figure>
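<p>As a point of comparison for the deduplication in <code>addSeed</code>, here is a minimal sketch that relies on <code>Set.add</code> returning <code>false</code> when the element is already present. On a set created with <code>ConcurrentHashMap.newKeySet()</code> that check-and-insert is a single atomic call, so no <code>synchronized</code> block is needed for the set itself. The class <code>SeenUrls</code> and its method <code>markIfNew</code> are hypothetical names used only for illustration; they are not part of the original project.</p>
<figure class="highlight java hljs"><table><tr><td class="code"><pre>import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of the original project
class SeenUrls {
    private final Set&lt;String&gt; urls = ConcurrentHashMap.newKeySet();

    /** Returns true only for the first caller that registers this URL. */
    boolean markIfNew(String url) {
        // add() is atomic on a concurrent key set and returns false if the URL was already present
        return urls.add(url);
    }

    int size() {
        return urls.size();
    }
}</pre></td></tr></table></figure>
<p>A caller would enqueue the <code>CrawlMeta</code> only when <code>markIfNew</code> returns <code>true</code>. Separately, note that <code>Queue.add</code> on a full <code>ArrayBlockingQueue</code> throws <code>IllegalStateException</code>, whereas <code>offer</code> returns <code>false</code>, which matters once more than 200 links are pending.</p>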
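<h3 id="3-DefaultAbstractCrawlJob"><a href="#3-DefaultAbstractCrawlJob" class="headerlink" title="3. DefaultAbstractCrawlJob"></a>3. <code>DefaultAbstractCrawlJob</code></h3><blockquote>
<p>This is the default abstract crawl job. In Part 2, <a href="https://my.oschina.net/u/566591/blog/1079070" target="_blank" rel="noopener">depth crawling</a>, this job performed the entire depth crawl itself. Here we pull that logic out so that each job crawls only its own page. The links inside the page are parsed, wrapped, and put on the queue; the job does not fetch those pages itself.</p>
</blockquote>
<p>First, two member fields need to be added:</p>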
<figure class="highlight java hljs"><table><tr><td class="code"><pre>/**
 * Queue of tasks waiting to be crawled
 */
private FetchQueue fetchQueue;

/**
 * The parsed result
 */
private CrawlResult crawlResult;</pre></td></tr></table></figure>
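<p>Next, the crawl logic changes slightly. The main flow is essentially unchanged; the only difference is that the job no longer crawls each extracted link itself and instead puts the link on the queue. The changes are as follows:</p>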
<figure class="highlight java hljs"><table><tr><td class="code"><pre>/**
 * Fetch the page
 */
void doFetchPage() throws Exception {
    HttpResponse response = HttpUtils.request(this.crawlMeta, httpConf);
    String res = EntityUtils.toString(response.getEntity(), httpConf.getCode());
    if (response.getStatusLine().getStatusCode() != HttpStatus.SC_OK) { // request failed
        this.crawlResult = new CrawlResult();
        this.crawlResult.setStatus(response.getStatusLine().getStatusCode(), response.getStatusLine().getReasonPhrase());
        this.crawlResult.setUrl(crawlMeta.getUrl());
        this.visit(this.crawlResult);
        return;
    }

    // Parse the page
    this.crawlResult = doParse(res, this.crawlMeta);

    // Call back into the user's page-content handler
    this.visit(this.crawlResult);

    // Parse the links in the returned page and put those that qualify on the crawl queue
    int currentDepth = this.crawlMeta.getCurrentDepth();
    if (currentDepth &gt; depth) {
        return;
    }

    Elements elements = crawlResult.getHtmlDoc().select("a[href]");
    String src;
    for (Element element : elements) {
        // Make sure relative URLs are converted to absolute ones
        src = element.attr("abs:href");
        if (!matchRegex(src)) {
            continue;
        }

        // The child link sits one level deeper and inherits the parent's rules
        CrawlMeta meta = new CrawlMeta(currentDepth + 1,
                src,
                this.crawlMeta.getSelectorRules(),
                this.crawlMeta.getPositiveRegex(),
                this.crawlMeta.getNegativeRegex());
        fetchQueue.addSeed(meta);
    }
}</pre></td></tr></table></figure>
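<p>The <code>abs:href</code> lookup in <code>doFetchPage</code> asks jsoup for the link resolved against the URL of the page it was found on. As a rough illustration of that resolution step, here is a minimal sketch that uses only <code>java.net.URI</code> from the JDK. <code>LinkResolver</code> and <code>toAbsolute</code> are hypothetical names, and jsoup's own handling of malformed links may differ from what <code>URI.resolve</code> does.</p>
<figure class="highlight java hljs"><table><tr><td class="code"><pre>import java.net.URI;

// Hypothetical helper that illustrates relative-to-absolute resolution
class LinkResolver {
    /** Resolves href against the URL of the page it was found on. */
    static String toAbsolute(String pageUrl, String href) {
        return URI.create(pageUrl).resolve(href).toString();
    }

    public static void main(String[] args) {
        String page = "https://example.com/blog/post/index.html";
        System.out.println(toAbsolute(page, "../about.html"));           // relative path
        System.out.println(toAbsolute(page, "/tags"));                   // site-absolute path
        System.out.println(toAbsolute(page, "https://other.example/x")); // already absolute
    }
}</pre></td></tr></table></figure>
<p>Storing only absolute URLs in the queue is also what lets the <code>urls</code> set deduplicate reliably, since the same page reached through two different relative links ends up as the same string.</p>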
<p><code>String res = EntityUtils.toString(response.getEntity(), httpConf.getCode());</code></p>
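<p>The line above comes from <code>doFetchPage</code> and differs from the earlier version, so it deserves attention. Previously, character encoding was not considered when reading the response, so everything went through the default-charset logic. The relevant source is shown below. There <code>defaultCharset = null</code>, so the final charset is either the one parsed from the response's content type or <code>ISO_8859_1</code>. When no charset is specified, the text can therefore come out garbled.</p>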
<figure class="highlight plain hljs"><table><tr><td class="code"><pre>Charset charset = null;

// 1. Prefer the charset declared in the response's content type
try {
    ContentType contentType = ContentType.get(entity);
    if(contentType != null) {
        charset = contentType.getCharset();
    }
} catch (UnsupportedCharsetException var13) {
    throw new UnsupportedEncodingException(var13.getMessage());
}

// 2. Otherwise use the default charset passed in by the caller
if(charset == null) {
    charset = defaultCharset;
}

// 3. Otherwise fall back to the library default (ISO_8859_1)
if(charset == null) {
    charset = HTTP.DEF_CONTENT_CHARSET;
}</pre></td></tr></table></figure>
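<p>To make the garbling concrete, the following sketch decodes the same bytes twice using only JDK classes: once as UTF-8 and once as <code>ISO_8859_1</code>, which is what the fallback above ends up using when nothing else is specified. <code>CharsetDemo</code> and <code>decode</code> are hypothetical names, and the sample text is the two-character word for "crawler" written with Unicode escapes so that the source file's own encoding does not matter.</p>
<figure class="highlight java hljs"><table><tr><td class="code"><pre>import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo, not part of the original project
class CharsetDemo {
    /** Decodes a response body with the given charset. */
    static String decode(byte[] body, Charset charset) {
        return new String(body, charset);
    }

    public static void main(String[] args) {
        // A body that the server sent as UTF-8 (two characters, three bytes each)
        byte[] body = "\u722c\u866b".getBytes(StandardCharsets.UTF_8);

        // Correct charset: the original two characters come back
        System.out.println(decode(body, StandardCharsets.UTF_8));

        // Wrong charset: every byte becomes its own character, so the text is garbled
        System.out.println(decode(body, StandardCharsets.ISO_8859_1));
    }
}</pre></td></tr></table></figure>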
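<p>To fix the garbled-text problem, a new <code>code</code> parameter, which names the charset, was added to <code>HttpConf</code> (the network-related configuration). Because this series has not reached the networking module yet, the simplest possible approach is used for now: a method added to <code>DefaultAbstractCrawlJob</code> (the test later on shows how to use it).</p>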
<figure class="highlight java hljs"><table><tr><td class="code"><pre>protected void setResponseCode(String code) {
  httpConf.setCode(code);
}</pre></td></tr></table></figure>
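<h3 id="4-Fetcher"><a href="#4-Fetcher" class="headerlink" title="4. Fetcher"></a>4. <code>Fetcher</code></h3><blockquote>
<p>This is the new crawl controller class. It takes tasks off the queue and creates a job to run each one.</p>
</blockquote>
<p>Because its responsibility is clear-cut, the simplest implementation looks like this:</p>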
<figure class="highlight java hljs"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br></pre></td><td class="code"><pre><span class="line"><span class="hljs-keyword">public</span> <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Fetcher</span> </span>&#123;</span><br><span class="line"></span><br><span class="line">    <span class="hljs-keyword">private</span> <span 
class="hljs-keyword">int</span> maxDepth;</span><br><span class="line"></span><br><span class="line">    <span class="hljs-keyword">private</span> FetchQueue fetchQueue;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">    <span class="hljs-function"><span class="hljs-keyword">public</span> FetchQueue <span class="hljs-title">addFeed</span><span class="hljs-params">(CrawlMeta feed)</span> </span>&#123;</span><br><span class="line">        fetchQueue.addSeed(feed);</span><br><span class="line">        <span class="hljs-keyword">return</span> fetchQueue;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">    <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-title">Fetcher</span><span class="hljs-params">()</span> </span>&#123;</span><br><span class="line">        <span class="hljs-keyword">this</span>(<span class="hljs-number">0</span>);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">    <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-title">Fetcher</span><span class="hljs-params">(<span class="hljs-keyword">int</span> maxDepth)</span> </span>&#123;</span><br><span class="line">        <span class="hljs-keyword">this</span>.maxDepth = maxDepth;</span><br><span class="line">        fetchQueue = FetchQueue.DEFAULT_INSTANCE;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">    <span class="hljs-keyword">public</span> &lt;T extends DefaultAbstractCrawlJob&gt; <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">start</span><span class="hljs-params">(Class&lt;T&gt; clz)</span> <span class="hljs-keyword">throws</span> Exception </span>&#123;</span><br><span class="line">        
CrawlMeta crawlMeta;</span><br><span class="line">        <span class="hljs-keyword">int</span> i = <span class="hljs-number">0</span>;</span><br><span class="line">        <span class="hljs-keyword">while</span> (<span class="hljs-keyword">true</span>) &#123;</span><br><span class="line">            crawlMeta = fetchQueue.pollSeed();</span><br><span class="line">            <span class="hljs-keyword">if</span> (crawlMeta == <span class="hljs-keyword">null</span>) &#123;</span><br><span class="line">                Thread.sleep(<span class="hljs-number">200</span>);</span><br><span class="line">                <span class="hljs-keyword">if</span> (++i &gt; <span class="hljs-number">300</span>) &#123; <span class="hljs-comment">// 连续一分钟内没有数据时，退出</span></span><br><span class="line">                    <span class="hljs-keyword">break</span>;</span><br><span class="line">                &#125;</span><br><span class="line"></span><br><span class="line">                <span class="hljs-keyword">continue</span>;</span><br><span class="line">            &#125;</span><br><span class="line"></span><br><span class="line">            i = <span class="hljs-number">0</span>;</span><br><span class="line"></span><br><span class="line">            DefaultAbstractCrawlJob job = clz.newInstance();</span><br><span class="line">            job.setDepth(<span class="hljs-keyword">this</span>.maxDepth);</span><br><span class="line">            job.setCrawlMeta(crawlMeta);</span><br><span class="line">            job.setFetchQueue(fetchQueue);</span><br><span class="line"></span><br><span class="line">            <span class="hljs-keyword">new</span> Thread(job, <span class="hljs-string">"crawl-thread-"</span> + System.currentTimeMillis()).start();</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>
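<p>The polling loop above exits after 300 consecutive empty polls of 200 ms each, roughly one minute of idleness. As a hedged alternative sketch, the same idle-timeout idea can be expressed with <code>BlockingQueue.poll(timeout, unit)</code>. This assumes the underlying queue is a <code>BlockingQueue</code>, which the article does not state; <code>IdlePollDemo</code> and <code>drainUntilIdle</code> are hypothetical names used only for illustration.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the article's crawler code.
class IdlePollDemo {

    /**
     * Takes items from the queue until no new item arrives within idleMillis.
     * Returns the items in the order they were taken.
     */
    static List<String> drainUntilIdle(BlockingQueue<String> queue, long idleMillis)
            throws InterruptedException {
        List<String> seen = new ArrayList<>();
        while (true) {
            // poll(timeout, unit) waits up to idleMillis and returns null on timeout
            String url = queue.poll(idleMillis, TimeUnit.MILLISECONDS);
            if (url == null) {
                break; // the idle window elapsed without new work
            }
            seen.add(url);
        }
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("http://chengyu.t086.com/gushi/1.htm");
        queue.add("http://chengyu.t086.com/gushi/2.htm");
        System.out.println(drainUntilIdle(queue, 100));
    }
}
```

<p>Compared with sleeping and counting, the blocking poll wakes up as soon as work arrives, so the loop does not add up to 200 ms of latency per empty poll. It still relies on a timeout to decide that crawling is over, so it shares the same basic weakness as the original loop.</p>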
<h3 id="5-测试"><a href="#5-测试" class="headerlink" title="5. 测试"></a>5. Test</h3><p>The test code differs somewhat from the earlier versions and is a little more concise.</p>
<figure class="highlight java hljs"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br></pre></td><td class="code"><pre><span class="line"><span class="hljs-keyword">public</span> <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">QueueCrawlerTest</span> </span>&#123;</span><br><span class="line"></span><br><span class="line">    <span class="hljs-keyword">public</span> <span class="hljs-keyword">static</span> <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">QueueCrawlerJob</span> <span class="hljs-keyword">extends</span> <span class="hljs-title">DefaultAbstractCrawlJob</span> </span>&#123;</span><br><span class="line"></span><br><span class="line">        <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">void</span> <span class="hljs-title">beforeRun</span><span class="hljs-params">()</span> </span>&#123;</span><br><span class="line">            <span class="hljs-comment">// 设置返回的网页编码</span></span><br><span class="line">    
        <span class="hljs-keyword">super</span>.setResponseCode(<span class="hljs-string">"gbk"</span>);</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        <span class="hljs-meta">@Override</span></span><br><span class="line">        <span class="hljs-function"><span class="hljs-keyword">protected</span> <span class="hljs-keyword">void</span> <span class="hljs-title">visit</span><span class="hljs-params">(CrawlResult crawlResult)</span> </span>&#123;</span><br><span class="line">            System.out.println(Thread.currentThread().getName() + <span class="hljs-string">" ___ "</span> + crawlResult.getUrl());</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">    <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">static</span> <span class="hljs-keyword">void</span> <span class="hljs-title">main</span><span class="hljs-params">(String[] rags)</span> <span class="hljs-keyword">throws</span> Exception </span>&#123;</span><br><span class="line">        Fetcher fetcher = <span class="hljs-keyword">new</span> Fetcher(<span class="hljs-number">1</span>);</span><br><span class="line"></span><br><span class="line">        String url = <span class="hljs-string">"http://chengyu.t086.com/gushi/1.htm"</span>;</span><br><span class="line">        CrawlMeta crawlMeta = <span class="hljs-keyword">new</span> CrawlMeta();</span><br><span class="line">        crawlMeta.setUrl(url);</span><br><span class="line">        crawlMeta.addPositiveRegex(<span class="hljs-string">"http://chengyu.t086.com/gushi/[0-9]+\\.htm$"</span>);</span><br><span class="line"></span><br><span class="line">        fetcher.addFeed(crawlMeta);</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">        
fetcher.start(QueueCrawlerJob.class);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>
<p>The output is as follows:</p>
<figure class="highlight bash hljs"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">crawl-thread-1499333696153 ___ http://chengyu.t086.com/gushi/1.htm</span><br><span class="line">crawl-thread-1499333710801 ___ http://chengyu.t086.com/gushi/3.htm</span><br><span class="line">crawl-thread-1499333711142 ___ http://chengyu.t086.com/gushi/7.htm</span><br><span class="line">crawl-thread-1499333710801 ___ http://chengyu.t086.com/gushi/2.htm</span><br><span class="line">crawl-thread-1499333710802 ___ http://chengyu.t086.com/gushi/6.htm</span><br><span class="line">crawl-thread-1499333710801 ___ http://chengyu.t086.com/gushi/4.htm</span><br><span class="line">crawl-thread-1499333710802 ___ http://chengyu.t086.com/gushi/5.htm</span><br></pre></td></tr></table></figure>
<h2 id="改进"><a href="#改进" class="headerlink" title="改进"></a>Improvements</h2><blockquote>
<p>As before, the next step is to analyze the shortcomings of the implementation above and improve it.</p>
</blockquote>
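<h3 id="1-待改善点"><a href="#1-待改善点" class="headerlink" title="1. 待改善点"></a>1. Points to improve</h3><ul>
<li>In Fetcher, every task starts its own thread; a thread pool would manage this better.</li>
<li>In the Job, task execution and result analysis are not separated, which falls short of the goal that a job should only do the crawling.</li>
<li>The logic for exiting the program is crude.</li>
<li>A delay between page fetches could be added.</li>
<li>Job objects are created and destroyed frequently; an object pool might reduce GC pressure.</li>
</ul>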
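<h3 id="2-线程池"><a href="#2-线程池" class="headerlink" title="2. 线程池"></a>2. Thread pool</h3><p>We use Java's built-in thread pool directly. Because a thread pool has many configuration parameters, we first define a configuration class. It ships with a default configuration, which may not suit a real workload; the parameters should be tuned to the actual crawl tasks to get the best results.</p>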
<figure class="highlight java hljs"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="hljs-comment">// Fetcher.java </span></span><br><span class="line"></span><br><span class="line">  <span class="hljs-meta">@Getter</span></span><br><span class="line">  <span class="hljs-meta">@Setter</span></span><br><span class="line">  <span class="hljs-meta">@ToString</span></span><br><span class="line">  <span class="hljs-meta">@NoArgsConstructor</span></span><br><span class="line">  <span class="hljs-keyword">public</span> <span class="hljs-keyword">static</span> <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">ThreadConf</span> </span>&#123;</span><br><span class="line">      <span class="hljs-keyword">private</span> <span class="hljs-keyword">int</span> coreNum = <span class="hljs-number">6</span>;</span><br><span class="line">      <span class="hljs-keyword">private</span> <span class="hljs-keyword">int</span> maxNum = <span class="hljs-number">10</span>;</span><br><span class="line">      <span class="hljs-keyword">private</span> <span class="hljs-keyword">int</span> queueSize = <span class="hljs-number">10</span>;</span><br><span class="line">      <span class="hljs-keyword">private</span> <span class="hljs-keyword">int</span> aliveTime = <span class="hljs-number">1</span>;</span><br><span class="line">      <span class="hljs-keyword">private</span> 
TimeUnit timeUnit = TimeUnit.MINUTES;</span><br><span class="line">      <span class="hljs-keyword">private</span> String threadName = <span class="hljs-string">"crawl-fetch"</span>;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">      <span class="hljs-keyword">public</span> <span class="hljs-keyword">final</span> <span class="hljs-keyword">static</span> ThreadConf DEFAULT_CONF = <span class="hljs-keyword">new</span> ThreadConf();</span><br><span class="line">  &#125;</span><br></pre></td></tr></table></figure>
<p>Thread pool initialization:</p>
<figure class="highlight java hljs"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line"><span class="hljs-keyword">private</span> Executor executor;</span><br><span class="line"></span><br><span class="line"><span class="hljs-meta">@Setter</span></span><br><span class="line"><span class="hljs-keyword">private</span> ThreadConf threadConf;</span><br><span class="line"></span><br><span class="line"><span class="hljs-comment">/**</span></span><br><span class="line"><span class="hljs-comment"> * 初始化线程池</span></span><br><span class="line"><span class="hljs-comment"> */</span></span><br><span class="line"><span class="hljs-function"><span class="hljs-keyword">private</span> <span class="hljs-keyword">void</span> <span class="hljs-title">initExecutor</span><span class="hljs-params">()</span> </span>&#123;</span><br><span class="line">    executor = <span class="hljs-keyword">new</span> ThreadPoolExecutor(threadConf.getCoreNum(),</span><br><span class="line">            threadConf.getMaxNum(),</span><br><span class="line">            threadConf.getAliveTime(),</span><br><span class="line">            threadConf.getTimeUnit(),</span><br><span class="line">            <span class="hljs-keyword">new</span> LinkedBlockingQueue&lt;&gt;(threadConf.getQueueSize()),</span><br><span class="line">            <span class="hljs-keyword">new</span> 
CustomThreadFactory(threadConf.getThreadName()),</span><br><span class="line">            <span class="hljs-keyword">new</span> ThreadPoolExecutor.CallerRunsPolicy());</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>
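<p>The <code>CustomThreadFactory</code> passed to the pool is not shown in this article. A minimal sketch of a factory that would produce names such as <code>crawl-fetch-1</code> might look like the following. The class name <code>NamedThreadFactory</code> and the <code>prefix-N</code> naming scheme are assumptions made for illustration, not the author's implementation.</p>

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the CustomThreadFactory referenced above.
class NamedThreadFactory implements ThreadFactory {

    private final String prefix;
    // Counts the threads created so far so that each one gets a unique suffix
    private final AtomicInteger counter = new AtomicInteger(0);

    NamedThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable r) {
        return new Thread(r, prefix + "-" + counter.incrementAndGet());
    }

    public static void main(String[] args) throws InterruptedException {
        // Same shape as the default ThreadConf: core 6, max 10, queue size 10
        ThreadPoolExecutor executor = new ThreadPoolExecutor(6, 10, 1, TimeUnit.MINUTES,
                new LinkedBlockingQueue<>(10),
                new NamedThreadFactory("crawl-fetch"),
                new ThreadPoolExecutor.CallerRunsPolicy());
        for (int i = 1; i <= 7; i++) {
            final int page = i;
            executor.execute(() ->
                    System.out.println(Thread.currentThread().getName() + " ___ page " + page));
        }
        executor.shutdown(); // stop accepting tasks, let queued ones finish
        executor.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

<p>With six core threads and seven tasks, the seventh task waits in the queue until one of the existing threads becomes free, so at most six distinct thread names appear in the output.</p>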
<p>To execute a task, simply replace the original <code>new Thread(...)</code> call with a submission to the thread pool:</p>
<figure class="highlight java hljs"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="hljs-comment">// com.quick.hui.crawler.core.fetcher.Fetcher#start</span></span><br><span class="line"></span><br><span class="line">executor.execute(job);</span><br></pre></td></tr></table></figure>
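<p>The test case is the same as before, but the output differs slightly (mainly in the thread names). Note that <code>crawl-fetch-1</code> appears twice: the pool's core size (<code>coreNum</code>) is 6 while there are 7 crawl tasks, so one thread was reused. The benefit becomes obvious once there are many crawl tasks.</p>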
<figure class="highlight bash hljs"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">crawl-fetch-1 ___ http://chengyu.t086.com/gushi/1.htm</span><br><span class="line">crawl-fetch-2 ___ http://chengyu.t086.com/gushi/2.htm</span><br><span class="line">crawl-fetch-5 ___ http://chengyu.t086.com/gushi/5.htm</span><br><span class="line">crawl-fetch-1 ___ http://chengyu.t086.com/gushi/7.htm</span><br><span class="line">crawl-fetch-3 ___ http://chengyu.t086.com/gushi/3.htm</span><br><span class="line">crawl-fetch-4 ___ http://chengyu.t086.com/gushi/4.htm</span><br><span class="line">crawl-fetch-6 ___ http://chengyu.t086.com/gushi/6.htm</span><br></pre></td></tr></table></figure>
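<h3 id="3-ResultFilter"><a href="#3-ResultFilter" class="headerlink" title="3. ResultFilter"></a>3. ResultFilter</h3><blockquote>
<p>A class for parsing results: it scans the links in a crawled page, wraps those that meet the criteria, and puts them into the to-crawl queue.</p>
</blockquote>
<p>The implementation itself is fairly simple; the hard part is deciding when the crawl has finished.</p>
<p>One simple approach:</p>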
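<ul>
<li>Starting from level 0 (the seed), we know that level 1 has count tasks.</li>
<li>From task 0 of level 1 there are count10 tasks; from task 1 there are count11 tasks.</li>
<li>From task 0 of level 2 there are count20 tasks…</li>
</ul>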
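<p>When a task on the last level finishes, the completion count of its parent on the level above is incremented. If that count now equals the parent's number of tasks, the count one level further up is incremented as well, and so on, until the level-0 completion count equals count. Only then is the crawl complete.</p>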
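<p>The upward propagation of completion counts can be sketched in a few lines. This is a simplified, single-threaded illustration and not the article's <code>FetchQueue</code> code: <code>CompletionTreeDemo</code>, <code>register</code> and <code>markSubtreeDone</code> are hypothetical names, the seed is assumed to have parent id <code>-1</code>, and every task is assumed to be registered before any of its children completes.</p>

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the "count completions upward" idea.
class CompletionTreeDemo {

    static final class Node {
        final int parentId;   // -1 marks the seed
        final int childCount; // number of child tasks found on this page
        int finished;         // number of child subtrees already complete

        Node(int parentId, int childCount) {
            this.parentId = parentId;
            this.childCount = childCount;
        }

        boolean done() {
            return finished == childCount;
        }
    }

    private final Map<Integer, Node> nodes = new HashMap<>();

    void register(int id, int parentId, int childCount) {
        nodes.put(id, new Node(parentId, childCount));
    }

    /**
     * Call when task id and all of its descendants are complete
     * (for example, a leaf task). Returns true once the seed is complete.
     */
    boolean markSubtreeDone(int id) {
        Node node = nodes.get(id);
        while (node.parentId != -1) {
            Node parent = nodes.get(node.parentId);
            parent.finished++;
            if (!parent.done()) {
                return false; // the parent still waits for other children
            }
            node = parent; // parent is complete, keep propagating upward
        }
        return true; // reached the seed with all counts satisfied
    }

    public static void main(String[] args) {
        CompletionTreeDemo tree = new CompletionTreeDemo();
        tree.register(1, -1, 2); // seed with two children: 2 and 3
        tree.register(2, 1, 2);  // task 2 has two children: 4 and 5
        tree.register(3, 1, 0);  // leaf
        tree.register(4, 2, 0);  // leaf
        tree.register(5, 2, 0);  // leaf
        System.out.println(tree.markSubtreeDone(3));
        System.out.println(tree.markSubtreeDone(4));
        System.out.println(tree.markSubtreeDone(5));
    }
}
```

<p>In a concurrent crawler the counters would need to be atomic or guarded by a lock, which is why the article's <code>JobCount</code> uses <code>AtomicInteger</code> and a <code>synchronized</code> method.</p>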
<h4 id="计数配置-JobCount"><a href="#计数配置-JobCount" class="headerlink" title="计数配置 JobCount"></a>Count configuration: JobCount</h4><p>Every crawl job has a corresponding <code>JobCount</code>. Note its fields, and note that the <code>JobCount</code> id must be globally unique.</p>
<figure class="highlight java hljs"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span 
class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br></pre></td><td class="code"><pre><span class="line"><span class="hljs-meta">@Getter</span></span><br><span class="line"><span class="hljs-keyword">public</span> <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">JobCount</span> </span>&#123;</span><br><span class="line"></span><br><span class="line">    <span class="hljs-keyword">public</span> <span class="hljs-keyword">static</span> <span class="hljs-keyword">int</span> SEED_ID = <span class="hljs-number">1</span>;</span><br><span class="line"></span><br><span class="line">    <span class="hljs-keyword">public</span> <span class="hljs-keyword">static</span> AtomicInteger idGen = <span class="hljs-keyword">new</span> AtomicInteger(<span class="hljs-number">0</span>);</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">    <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">static</span> <span class="hljs-keyword">int</span> <span class="hljs-title">genId</span><span class="hljs-params">()</span> </span>&#123;</span><br><span class="line">        <span class="hljs-keyword">return</span> idGen.addAndGet(<span class="hljs-number">1</span>);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">    <span class="hljs-comment">/**</span></span><br><span class="line"><span class="hljs-comment">     * 该Job对应的唯一ID</span></span><br><span class="line"><span class="hljs-comment">     */</span></span><br><span class="line">    <span class="hljs-keyword">private</span> <span class="hljs-keyword">int</span> id;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">    <span class="hljs-comment">/**</span></span><br><span class="line"><span 
class="hljs-comment">     * 该job对应父job的id</span></span><br><span class="line"><span class="hljs-comment">     */</span></span><br><span class="line">    <span class="hljs-keyword">private</span> <span class="hljs-keyword">int</span> upperId;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">    <span class="hljs-comment">/**</span></span><br><span class="line"><span class="hljs-comment">     * 当前的层数</span></span><br><span class="line"><span class="hljs-comment">     */</span></span><br><span class="line">    <span class="hljs-keyword">private</span> <span class="hljs-keyword">int</span> currentDepth;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">    <span class="hljs-comment">/**</span></span><br><span class="line"><span class="hljs-comment">     * 该job对应的网页中，子Job的数量</span></span><br><span class="line"><span class="hljs-comment">     */</span></span><br><span class="line">    <span class="hljs-keyword">private</span> AtomicInteger jobCount = <span class="hljs-keyword">new</span> AtomicInteger(<span class="hljs-number">0</span>);</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">    <span class="hljs-comment">/**</span></span><br><span class="line"><span class="hljs-comment">     * 该Job对应的网页中， 子Job完成的数量</span></span><br><span class="line"><span class="hljs-comment">     */</span></span><br><span class="line">    <span class="hljs-keyword">private</span> AtomicInteger finishCount = <span class="hljs-keyword">new</span> AtomicInteger(<span class="hljs-number">0</span>);</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">    <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">boolean</span> <span class="hljs-title">fetchOver</span><span class="hljs-params">()</span> </span>&#123;</span><br><span class="line">        <span class="hljs-keyword">return</span> 
jobCount.get() == finishCount.get();</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">    <span class="hljs-comment">/**</span></span><br><span class="line"><span class="hljs-comment">     * 爬取完成一个子任务</span></span><br><span class="line"><span class="hljs-comment">     */</span></span><br><span class="line">    <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">synchronized</span> <span class="hljs-keyword">boolean</span> <span class="hljs-title">finishJob</span><span class="hljs-params">()</span> </span>&#123;</span><br><span class="line">        finishCount.addAndGet(<span class="hljs-number">1</span>);</span><br><span class="line">        <span class="hljs-keyword">return</span> fetchOver();</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">    <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-title">JobCount</span><span class="hljs-params">(<span class="hljs-keyword">int</span> id, <span class="hljs-keyword">int</span> upperId, <span class="hljs-keyword">int</span> currentDepth, <span class="hljs-keyword">int</span> jobCount, <span class="hljs-keyword">int</span> finishCount)</span> </span>&#123;</span><br><span class="line">        <span class="hljs-keyword">this</span>.id = id;</span><br><span class="line">        <span class="hljs-keyword">this</span>.upperId = upperId;</span><br><span class="line">        <span class="hljs-keyword">this</span>.currentDepth = currentDepth;</span><br><span class="line">        <span class="hljs-keyword">this</span>.jobCount.set(jobCount);</span><br><span class="line">        <span class="hljs-keyword">this</span>.finishCount.set(finishCount);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>
<p>To associate each Job with its JobCount, two new fields are added to CrawlMeta:</p>
<figure class="highlight java hljs"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="hljs-comment">/**</span></span><br><span class="line"><span class="hljs-comment"> * The &#123;<span class="hljs-doctag">@link</span> JobCount#id &#125; of the current task</span></span><br><span class="line"><span class="hljs-comment"> */</span></span><br><span class="line"><span class="hljs-meta">@Getter</span></span><br><span class="line"><span class="hljs-meta">@Setter</span></span><br><span class="line"><span class="hljs-keyword">private</span> <span class="hljs-keyword">int</span> jobId;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="hljs-comment">/**</span></span><br><span class="line"><span class="hljs-comment"> * The &#123;<span class="hljs-doctag">@link</span> JobCount#upperId &#125; of the current task, i.e. the id of its parent job</span></span><br><span class="line"><span class="hljs-comment"> */</span></span><br><span class="line"><span class="hljs-meta">@Getter</span></span><br><span class="line"><span class="hljs-meta">@Setter</span></span><br><span class="line"><span class="hljs-keyword">private</span> <span class="hljs-keyword">int</span> parentJobId;</span><br></pre></td></tr></table></figure>
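<p>The crawl queue is adjusted accordingly: a new isOver field indicates whether crawling has finished, and a <code>jobCountMap</code> records the counts for each Job.</p>
<p>The modified <code>FetchQueue</code> code is shown below; pay particular attention to how the <code>finishOneJob</code> methods are implemented.</p>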
<figure class="highlight java hljs"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span 
class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line"><span class="hljs-comment">/**</span></span><br><span class="line"><span class="hljs-comment"> * JobCount 映射表， key为 &#123;<span class="hljs-doctag">@link</span> JobCount#id&#125;, value 为对应的JobCount</span></span><br><span class="line"><span class="hljs-comment"> */</span></span><br><span class="line"><span class="hljs-keyword">public</span> Map&lt;Integer, JobCount&gt; jobCountMap = <span class="hljs-keyword">new</span> ConcurrentHashMap&lt;&gt;();</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="hljs-comment">/**</span></span><br><span class="line"><span class="hljs-comment"> * 爬取是否完成的标识</span></span><br><span class="line"><span class="hljs-comment"> */</span></span><br><span class="line"><span class="hljs-keyword">public</span> <span 
class="hljs-keyword">volatile</span> <span class="hljs-keyword">boolean</span> isOver = <span class="hljs-keyword">false</span>;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="hljs-comment">/**</span></span><br><span class="line"><span class="hljs-comment"> * 当没有爬取过时， 才丢入队列； 主要是避免重复爬取的问题</span></span><br><span class="line"><span class="hljs-comment"> *</span></span><br><span class="line"><span class="hljs-comment"> * <span class="hljs-doctag">@param</span> crawlMeta</span></span><br><span class="line"><span class="hljs-comment"> */</span></span><br><span class="line"><span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">boolean</span> <span class="hljs-title">addSeed</span><span class="hljs-params">(CrawlMeta crawlMeta)</span> </span>&#123;</span><br><span class="line">    <span class="hljs-keyword">if</span> (urls.contains(crawlMeta.getUrl())) &#123;</span><br><span class="line">        <span class="hljs-keyword">return</span> <span class="hljs-keyword">false</span>;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="hljs-keyword">synchronized</span> (<span class="hljs-keyword">this</span>) &#123;</span><br><span class="line">        <span class="hljs-keyword">if</span> (urls.contains(crawlMeta.getUrl())) &#123;</span><br><span class="line">            <span class="hljs-keyword">return</span> <span class="hljs-keyword">false</span>;</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">        urls.add(crawlMeta.getUrl());</span><br><span class="line">        toFetchQueue.add(crawlMeta);</span><br><span class="line">        <span class="hljs-keyword">return</span> <span class="hljs-keyword">true</span>;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span 
class="line"></span><br><span class="line"><span class="hljs-function"><span class="hljs-keyword">public</span> CrawlMeta <span class="hljs-title">pollSeed</span><span class="hljs-params">()</span> </span>&#123;</span><br><span class="line">    <span class="hljs-keyword">return</span> toFetchQueue.poll();</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">void</span> <span class="hljs-title">finishJob</span><span class="hljs-params">(CrawlMeta crawlMeta, <span class="hljs-keyword">int</span> count, <span class="hljs-keyword">int</span> maxDepth)</span> </span>&#123;</span><br><span class="line">    <span class="hljs-keyword">if</span> (finishOneJob(crawlMeta, count, maxDepth)) &#123;</span><br><span class="line">        isOver = <span class="hljs-keyword">true</span>;</span><br><span class="line">        System.out.println(<span class="hljs-string">"============ finish crawl! 
======"</span>);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="hljs-comment">/**</span></span><br><span class="line"><span class="hljs-comment"> * Mark one crawl job as finished</span></span><br><span class="line"><span class="hljs-comment"> *</span></span><br><span class="line"><span class="hljs-comment"> * <span class="hljs-doctag">@param</span> crawlMeta the crawl job</span></span><br><span class="line"><span class="hljs-comment"> * <span class="hljs-doctag">@param</span> count     number of links on the crawled page that qualify for further crawling</span></span><br><span class="line"><span class="hljs-comment"> * <span class="hljs-doctag">@return</span> true if every job has been crawled</span></span><br><span class="line"><span class="hljs-comment"> */</span></span><br><span class="line"><span class="hljs-function"><span class="hljs-keyword">private</span> <span class="hljs-keyword">boolean</span> <span class="hljs-title">finishOneJob</span><span class="hljs-params">(CrawlMeta crawlMeta, <span class="hljs-keyword">int</span> count, <span class="hljs-keyword">int</span> maxDepth)</span> </span>&#123;</span><br><span class="line">    JobCount jobCount = <span class="hljs-keyword">new</span> JobCount(crawlMeta.getJobId(),</span><br><span class="line">            crawlMeta.getParentJobId(),</span><br><span class="line">            crawlMeta.getCurrentDepth(),</span><br><span class="line">            count, <span class="hljs-number">0</span>);</span><br><span class="line">    jobCountMap.put(crawlMeta.getJobId(), jobCount);</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">    <span class="hljs-keyword">if</span> (crawlMeta.getCurrentDepth() == <span class="hljs-number">0</span>) &#123; <span class="hljs-comment">// special case: the seed page</span></span><br><span class="line">        <span class="hljs-keyword">return</span> count == <span class="hljs-number">0</span>; <span class="hljs-comment">// 
no child links to crawl, so finish immediately</span></span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">    <span class="hljs-keyword">if</span> (count == <span class="hljs-number">0</span> || crawlMeta.getCurrentDepth() == maxDepth) &#123;</span><br><span class="line">        <span class="hljs-comment">// this job is a leaf (no children, or max depth reached): increment the parent's finished count</span></span><br><span class="line">        <span class="hljs-keyword">return</span> finishOneJob(jobCountMap.get(crawlMeta.getParentJobId()));</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">    <span class="hljs-keyword">return</span> <span class="hljs-keyword">false</span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="hljs-comment">/**</span></span><br><span class="line"><span class="hljs-comment"> * Recursively propagate the finished count (+1) up through the parent jobs</span></span><br><span class="line"><span class="hljs-comment"> *</span></span><br><span class="line"><span class="hljs-comment"> * <span class="hljs-doctag">@param</span> jobCount</span></span><br><span class="line"><span class="hljs-comment"> * <span class="hljs-doctag">@return</span> true if all jobs have been crawled</span></span><br><span class="line"><span class="hljs-comment"> */</span></span><br><span class="line"><span class="hljs-function"><span class="hljs-keyword">private</span> <span class="hljs-keyword">boolean</span> <span class="hljs-title">finishOneJob</span><span class="hljs-params">(JobCount jobCount)</span> </span>&#123;</span><br><span class="line">    <span class="hljs-keyword">if</span> (jobCount.finishJob()) &#123;</span><br><span class="line">        <span class="hljs-keyword">if</span> (jobCount.getCurrentDepth() == <span class="hljs-number">0</span>) &#123;</span><br><span class="line">            <span class="hljs-keyword">return</span> <span class="hljs-keyword">true</span>; <span class="hljs-comment">//  
done</span></span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        <span class="hljs-keyword">return</span> finishOneJob(jobCountMap.get(jobCount.getParentId()));</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="hljs-keyword">return</span> <span class="hljs-keyword">false</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>
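<p>The upward counting in <code>finishOneJob</code> can be tried out in isolation. The following is a minimal sketch, not the project's actual classes: <code>JobTreeDemo</code> and <code>Node</code> are hypothetical stand-ins for <code>FetchQueue</code> and <code>JobCount</code>, and it assumes that <code>JobCount.finishJob()</code> increments a finished counter and returns true once that counter equals the number of child links.</p>

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, simplified model of the job tree; not the article's classes.
final class JobTreeDemo {

    /** Stand-in for JobCount: one node per crawl job. */
    static final class Node {
        final int id;
        final int parentId;
        final int depth;
        final int childCount; // number of links this job put on the queue
        final AtomicInteger finished = new AtomicInteger();

        Node(int id, int parentId, int depth, int childCount) {
            this.id = id;
            this.parentId = parentId;
            this.depth = depth;
            this.childCount = childCount;
        }

        /** Records one finished child; true once every child has finished (assumed semantics). */
        boolean finishChild() {
            return finished.incrementAndGet() == childCount;
        }
    }

    static final Map<Integer, Node> nodes = new ConcurrentHashMap<>();

    static Node register(int id, int parentId, int depth, int childCount) {
        Node node = new Node(id, parentId, depth, childCount);
        nodes.put(id, node);
        return node;
    }

    /** Propagates a finished leaf upward; true when the seed job (depth 0) is complete. */
    static boolean leafFinished(Node leaf) {
        Node current = nodes.get(leaf.parentId);
        while (current.finishChild()) {
            if (current.depth == 0) {
                return true; // every descendant of the seed has finished
            }
            current = nodes.get(current.parentId);
        }
        return false; // some sibling is still running
    }

    public static void main(String[] args) {
        register(0, -1, 0, 2);              // seed page with two links
        Node a = register(1, 0, 1, 0);      // leaf: no further links
        register(2, 0, 1, 1);               // page with one link
        Node c = register(3, 2, 2, 0);      // leaf at the deepest level
        System.out.println(leafFinished(a));
        System.out.println(leafFinished(c));
    }
}
```

<p>In this sketch every parent is registered before its children finish. The real crawler would need the same ordering guarantee, since a child looks its parent up in the map.</p>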
<p>Accordingly, the loop condition in the <code>Fetcher</code> class now checks the <code>isOver</code> flag of <code>fetchQueue</code>:</p>
<figure class="highlight java hljs"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="hljs-keyword">public</span> &lt;T extends DefaultAbstractCrawlJob&gt; <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">start</span><span class="hljs-params">(Class&lt;T&gt; clz)</span> <span class="hljs-keyword">throws</span> Exception </span>&#123;</span><br><span class="line">        CrawlMeta crawlMeta;</span><br><span class="line"></span><br><span class="line">    <span class="hljs-keyword">while</span> (!fetchQueue.isOver) &#123;</span><br><span class="line">        crawlMeta = fetchQueue.pollSeed();</span><br><span class="line">        <span class="hljs-keyword">if</span> (crawlMeta == <span class="hljs-keyword">null</span>) &#123;</span><br><span class="line">            Thread.sleep(<span class="hljs-number">200</span>);</span><br><span class="line">            <span class="hljs-keyword">continue</span>;</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">        DefaultAbstractCrawlJob job = clz.newInstance();</span><br><span class="line">        job.setDepth(<span class="hljs-keyword">this</span>.maxDepth);</span><br><span class="line">        job.setCrawlMeta(crawlMeta);</span><br><span class="line">        
job.setFetchQueue(fetchQueue);</span><br><span class="line"></span><br><span class="line">        executor.execute(job);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>
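<p>The loop above sleeps for 200 ms whenever the queue is empty. As a hedged alternative, assuming the queue can be a <code>BlockingQueue</code> (its type is not shown in this section), a timed <code>poll</code> waits for work and still rechecks the finish flag regularly. The class <code>PollLoopDemo</code>, the string "URLs" and the <code>remaining</code> counter are hypothetical and only drive the demo:</p>

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a dispatch loop driven by a volatile finish flag.
final class PollLoopDemo {

    static volatile boolean isOver = false;

    /** Dispatches every seed to a worker pool and returns the URLs that were visited. */
    static List<String> run(List<String> seeds) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(seeds);
        List<String> visited = new CopyOnWriteArrayList<>();
        AtomicInteger remaining = new AtomicInteger(seeds.size());
        ExecutorService executor = Executors.newFixedThreadPool(4);
        isOver = seeds.isEmpty();
        try {
            while (!isOver) {
                // Wait up to 200 ms for work, then loop to recheck the flag.
                String url = queue.poll(200, TimeUnit.MILLISECONDS);
                if (url == null) {
                    continue;
                }
                executor.execute(() -> {
                    visited.add(url);
                    // The last job to finish flips the flag, as finishJob does above.
                    if (remaining.decrementAndGet() == 0) {
                        isOver = true;
                    }
                });
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        } finally {
            executor.shutdown();
        }
        return visited;
    }
}
```

<p>Each worker records its URL before decrementing the counter, so by the time the loop sees <code>isOver == true</code> every visit has been recorded.</p>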
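<p>That completes the termination check. Next, the code in the Job is split up: the logic that filters the links found on a crawled page moves into <code>ResultFilter</code>, which is essentially a straight code migration.</p>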
<figure class="highlight java hljs"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span 
class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br></pre></td><td class="code"><pre><span class="line"><span class="hljs-keyword">public</span> <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">ResultFilter</span> </span>&#123;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">    <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">static</span> <span class="hljs-keyword">void</span> <span class="hljs-title">filter</span><span class="hljs-params">(CrawlMeta crawlMeta,</span></span></span><br><span class="line"><span class="hljs-function"><span class="hljs-params">                              CrawlResult crawlResult,</span></span></span><br><span class="line"><span class="hljs-function"><span class="hljs-params">                              FetchQueue fetchQueue,</span></span></span><br><span class="line"><span class="hljs-function"><span class="hljs-params">                              <span class="hljs-keyword">int</span> maxDepth)</span> </span>&#123;</span><br><span class="line">        <span class="hljs-keyword">int</span> count = <span class="hljs-number">0</span>;</span><br><span class="line">        <span class="hljs-keyword">try</span> &#123;</span><br><span class="line">            <span class="hljs-comment">// parse the links in the returned page and enqueue those that match the rules</span></span><br><span class="line">            <span class="hljs-keyword">int</span> currentDepth = crawlMeta.getCurrentDepth();</span><br><span class="line">            <span class="hljs-keyword">if</span> (currentDepth &gt;= maxDepth) &#123;</span><br><span class="line">                <span class="hljs-keyword">return</span>;</span><br><span class="line">            
&#125;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">            <span class="hljs-comment">// count holds the number of links on the current page that can be crawled further</span></span><br><span class="line"></span><br><span class="line">            Elements elements = crawlResult.getHtmlDoc().select(<span class="hljs-string">"a[href]"</span>);</span><br><span class="line">            String src;</span><br><span class="line">            <span class="hljs-keyword">for</span> (Element element : elements) &#123;</span><br><span class="line">                <span class="hljs-comment">// make sure relative addresses are converted to absolute ones</span></span><br><span class="line">                src = element.attr(<span class="hljs-string">"abs:href"</span>);</span><br><span class="line">                <span class="hljs-keyword">if</span> (!matchRegex(crawlMeta, src)) &#123;</span><br><span class="line">                    <span class="hljs-keyword">continue</span>;</span><br><span class="line">                &#125;</span><br><span class="line"></span><br><span class="line">                CrawlMeta meta = <span class="hljs-keyword">new</span> CrawlMeta(</span><br><span class="line">                        JobCount.genId(),</span><br><span class="line">                        crawlMeta.getJobId(),</span><br><span class="line">                        currentDepth + <span class="hljs-number">1</span>,</span><br><span class="line">                        src,</span><br><span class="line">                        crawlMeta.getSelectorRules(),</span><br><span class="line">                        crawlMeta.getPositiveRegex(),</span><br><span class="line">                        crawlMeta.getNegativeRegex());</span><br><span class="line">                <span class="hljs-keyword">if</span> (fetchQueue.addSeed(meta)) &#123;</span><br><span class="line">                    count++;</span><br><span class="line">                &#125;</span><br><span class="line">            &#125;</span><br><span class="line"></span><br><span class="line"> 
       &#125; <span class="hljs-keyword">finally</span> &#123; <span class="hljs-comment">// report this job as finished so the parent's count can be incremented</span></span><br><span class="line">            fetchQueue.finishJob(crawlMeta, count, maxDepth);</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">    <span class="hljs-function"><span class="hljs-keyword">private</span> <span class="hljs-keyword">static</span> <span class="hljs-keyword">boolean</span> <span class="hljs-title">matchRegex</span><span class="hljs-params">(CrawlMeta crawlMeta, String url)</span> </span>&#123;</span><br><span class="line">        Matcher matcher;</span><br><span class="line">        <span class="hljs-keyword">for</span> (Pattern pattern : crawlMeta.getPositiveRegex()) &#123;</span><br><span class="line">            matcher = pattern.matcher(url);</span><br><span class="line">            <span class="hljs-keyword">if</span> (matcher.find()) &#123;</span><br><span class="line">                <span class="hljs-keyword">return</span> <span class="hljs-keyword">true</span>;</span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">        <span class="hljs-keyword">for</span> (Pattern pattern : crawlMeta.getNegativeRegex()) &#123;</span><br><span class="line">            matcher = pattern.matcher(url);</span><br><span class="line">            <span class="hljs-keyword">if</span> (matcher.find()) &#123;</span><br><span class="line">                <span class="hljs-keyword">return</span> <span class="hljs-keyword">false</span>;</span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">        <span class="hljs-keyword">return</span> 
crawlMeta.getPositiveRegex().size() == <span class="hljs-number">0</span>;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>
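<p>In <code>matchRegex</code> above, a URL that matches any positive pattern is accepted before the negative patterns are consulted, so negative patterns only take effect when no positive pattern is configured. If exclusions should always win, a variant along the following lines may be worth considering. This is a sketch under that assumption, not the original author's rule; <code>UrlRuleDemo</code> and <code>accept</code> are hypothetical names.</p>

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical variant of the link filter in which negative patterns take precedence.
final class UrlRuleDemo {

    /**
     * Returns true when the url should be crawled:
     * rejected if any negative pattern matches, otherwise accepted when
     * no positive pattern is configured or at least one positive pattern matches.
     */
    static boolean accept(String url, List<Pattern> positive, List<Pattern> negative) {
        for (Pattern pattern : negative) {
            if (pattern.matcher(url).find()) {
                return false;
            }
        }
        if (positive.isEmpty()) {
            return true;
        }
        for (Pattern pattern : positive) {
            if (pattern.matcher(url).find()) {
                return true;
            }
        }
        return false;
    }
}
```

<p>With a positive pattern such as <code>gushi/[0-9]+\.html$</code> and a negative pattern for one specific page, that page is skipped here, whereas the original order would still enqueue it.</p>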
<p>The test code differs only slightly from before: the depth is set to 2, and the crawl regex is adjusted a little.</p>
<figure class="highlight java hljs"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br></pre></td><td class="code"><pre><span class="line"><span class="hljs-keyword">public</span> <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">QueueCrawlerTest</span> </span>&#123;</span><br><span class="line"></span><br><span class="line">    <span class="hljs-keyword">public</span> <span class="hljs-keyword">static</span> <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">QueueCrawlerJob</span> <span class="hljs-keyword">extends</span> <span class="hljs-title">DefaultAbstractCrawlJob</span> </span>&#123;</span><br><span class="line"></span><br><span class="line">        <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">void</span> <span class="hljs-title">beforeRun</span><span class="hljs-params">()</span> </span>&#123;</span><br><span class="line">            <span class="hljs-comment">// 
set the encoding of the returned page</span></span><br><span class="line">            <span class="hljs-keyword">super</span>.setResponseCode(<span class="hljs-string">"gbk"</span>);</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        <span class="hljs-meta">@Override</span></span><br><span class="line">        <span class="hljs-function"><span class="hljs-keyword">protected</span> <span class="hljs-keyword">void</span> <span class="hljs-title">visit</span><span class="hljs-params">(CrawlResult crawlResult)</span> </span>&#123;</span><br><span class="line">            System.out.println(Thread.currentThread().getName() + <span class="hljs-string">"___"</span> + crawlMeta.getCurrentDepth() + <span class="hljs-string">"___"</span> + crawlResult.getUrl());</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">    <span class="hljs-meta">@Test</span></span><br><span class="line">    <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">void</span> <span class="hljs-title">testCrawel</span><span class="hljs-params">()</span> <span class="hljs-keyword">throws</span> Exception </span>&#123;</span><br><span class="line">        Fetcher fetcher = <span class="hljs-keyword">new</span> Fetcher(<span class="hljs-number">2</span>);</span><br><span class="line"></span><br><span class="line">        String url = <span class="hljs-string">"http://chengyu.t086.com/gushi/1.htm"</span>;</span><br><span class="line">        CrawlMeta crawlMeta = <span class="hljs-keyword">new</span> CrawlMeta();</span><br><span class="line">        crawlMeta.setUrl(url);</span><br><span class="line">        crawlMeta.addPositiveRegex(<span class="hljs-string">"http://chengyu.t086.com/gushi/[0-9]+\\.html$"</span>);</span><br><span class="line"></span><br><span class="line">        
fetcher.addFeed(crawlMeta);</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">        fetcher.start(QueueCrawlerJob.class);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>
<p>The output is as follows:</p>
<figure class="highlight plain hljs"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span 
class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br><span class="line">114</span><br><span class="line">115</span><br><span class="line">116</span><br><span class="line">117</span><br></pre></td><td class="code"><pre><span class="line">crawl-fetch-1___0___http://chengyu.t086.com/gushi/1.htm</span><br><span 
class="line">crawl-fetch-7___1___http://chengyu.t086.com/gushi/673.html</span><br><span class="line">crawl-fetch-1___1___http://chengyu.t086.com/gushi/683.html</span><br><span class="line">crawl-fetch-3___1___http://chengyu.t086.com/gushi/687.html</span><br><span class="line">crawl-fetch-8___1___http://chengyu.t086.com/gushi/672.html</span><br><span class="line">crawl-fetch-4___1___http://chengyu.t086.com/gushi/686.html</span><br><span class="line">crawl-fetch-2___1___http://chengyu.t086.com/gushi/688.html</span><br><span class="line">crawl-fetch-6___1___http://chengyu.t086.com/gushi/684.html</span><br><span class="line">crawl-fetch-10___1___http://chengyu.t086.com/gushi/670.html</span><br><span class="line">main___1___http://chengyu.t086.com/gushi/669.html</span><br><span class="line">crawl-fetch-5___1___http://chengyu.t086.com/gushi/685.html</span><br><span class="line">crawl-fetch-9___1___http://chengyu.t086.com/gushi/671.html</span><br><span class="line">crawl-fetch-6___1___http://chengyu.t086.com/gushi/679.html</span><br><span class="line">crawl-fetch-10___1___http://chengyu.t086.com/gushi/677.html</span><br><span class="line">crawl-fetch-8___1___http://chengyu.t086.com/gushi/682.html</span><br><span class="line">crawl-fetch-7___1___http://chengyu.t086.com/gushi/681.html</span><br><span class="line">crawl-fetch-2___1___http://chengyu.t086.com/gushi/676.html</span><br><span class="line">main___1___http://chengyu.t086.com/gushi/660.html</span><br><span class="line">crawl-fetch-4___1___http://chengyu.t086.com/gushi/680.html</span><br><span class="line">crawl-fetch-5___1___http://chengyu.t086.com/gushi/675.html</span><br><span class="line">crawl-fetch-1___1___http://chengyu.t086.com/gushi/678.html</span><br><span class="line">crawl-fetch-9___1___http://chengyu.t086.com/gushi/674.html</span><br><span class="line">crawl-fetch-3___1___http://chengyu.t086.com/gushi/668.html</span><br><span 
class="line">crawl-fetch-6___1___http://chengyu.t086.com/gushi/667.html</span><br><span class="line">crawl-fetch-10___1___http://chengyu.t086.com/gushi/666.html</span><br><span class="line">crawl-fetch-8___1___http://chengyu.t086.com/gushi/665.html</span><br><span class="line">crawl-fetch-4___1___http://chengyu.t086.com/gushi/662.html</span><br><span class="line">crawl-fetch-5___1___http://chengyu.t086.com/gushi/661.html</span><br><span class="line">main___1___http://chengyu.t086.com/gushi/651.html</span><br><span class="line">crawl-fetch-3___1___http://chengyu.t086.com/gushi/657.html</span><br><span class="line">crawl-fetch-9___1___http://chengyu.t086.com/gushi/658.html</span><br><span class="line">crawl-fetch-2___1___http://chengyu.t086.com/gushi/663.html</span><br><span class="line">crawl-fetch-7___1___http://chengyu.t086.com/gushi/664.html</span><br><span class="line">crawl-fetch-1___1___http://chengyu.t086.com/gushi/659.html</span><br><span class="line">crawl-fetch-6___1___http://chengyu.t086.com/gushi/656.html</span><br><span class="line">crawl-fetch-10___1___http://chengyu.t086.com/gushi/655.html</span><br><span class="line">crawl-fetch-4___1___http://chengyu.t086.com/gushi/653.html</span><br><span class="line">crawl-fetch-5___1___http://chengyu.t086.com/gushi/652.html</span><br><span class="line">crawl-fetch-8___1___http://chengyu.t086.com/gushi/654.html</span><br><span class="line">crawl-fetch-3___1___http://chengyu.t086.com/gushi/650.html</span><br><span class="line">crawl-fetch-2___1___http://chengyu.t086.com/gushi/648.html</span><br><span class="line">crawl-fetch-9___1___http://chengyu.t086.com/gushi/649.html</span><br><span class="line">crawl-fetch-7___1___http://chengyu.t086.com/gushi/647.html</span><br><span class="line">main___1___http://chengyu.t086.com/gushi/640.html</span><br><span class="line">crawl-fetch-10___1___http://chengyu.t086.com/gushi/644.html</span><br><span 
class="line">crawl-fetch-6___1___http://chengyu.t086.com/gushi/645.html</span><br><span class="line">crawl-fetch-4___1___http://chengyu.t086.com/gushi/643.html</span><br><span class="line">crawl-fetch-1___1___http://chengyu.t086.com/gushi/646.html</span><br><span class="line">crawl-fetch-8___1___http://chengyu.t086.com/gushi/641.html</span><br><span class="line">crawl-fetch-5___1___http://chengyu.t086.com/gushi/642.html</span><br><span class="line">crawl-fetch-3___1___http://chengyu.t086.com/gushi/639.html</span><br><span class="line">crawl-fetch-9___1___http://chengyu.t086.com/gushi/635.html</span><br><span class="line">crawl-fetch-6___1___http://chengyu.t086.com/gushi/637.html</span><br><span class="line">crawl-fetch-7___1___http://chengyu.t086.com/gushi/634.html</span><br><span class="line">main___1___http://chengyu.t086.com/gushi/629.html</span><br><span class="line">crawl-fetch-10___1___http://chengyu.t086.com/gushi/638.html</span><br><span class="line">crawl-fetch-4___1___http://chengyu.t086.com/gushi/633.html</span><br><span class="line">crawl-fetch-1___1___http://chengyu.t086.com/gushi/632.html</span><br><span class="line">crawl-fetch-2___1___http://chengyu.t086.com/gushi/636.html</span><br><span class="line">crawl-fetch-5___1___http://chengyu.t086.com/gushi/630.html</span><br><span class="line">crawl-fetch-8___1___http://chengyu.t086.com/gushi/631.html</span><br><span class="line">crawl-fetch-9___1___http://chengyu.t086.com/gushi/627.html</span><br><span class="line">crawl-fetch-3___1___http://chengyu.t086.com/gushi/628.html</span><br><span class="line">main___1___http://chengyu.t086.com/gushi/617.html</span><br><span class="line">crawl-fetch-7___1___http://chengyu.t086.com/gushi/625.html</span><br><span class="line">crawl-fetch-1___1___http://chengyu.t086.com/gushi/622.html</span><br><span class="line">crawl-fetch-10___1___http://chengyu.t086.com/gushi/624.html</span><br><span 
class="line">crawl-fetch-6___1___http://chengyu.t086.com/gushi/626.html</span><br><span class="line">crawl-fetch-4___1___http://chengyu.t086.com/gushi/623.html</span><br><span class="line">crawl-fetch-2___1___http://chengyu.t086.com/gushi/621.html</span><br><span class="line">crawl-fetch-5___1___http://chengyu.t086.com/gushi/620.html</span><br><span class="line">crawl-fetch-1___1___http://chengyu.t086.com/gushi/614.html</span><br><span class="line">crawl-fetch-9___1___http://chengyu.t086.com/gushi/618.html</span><br><span class="line">crawl-fetch-6___1___http://chengyu.t086.com/gushi/612.html</span><br><span class="line">crawl-fetch-4___1___http://chengyu.t086.com/gushi/611.html</span><br><span class="line">crawl-fetch-8___1___http://chengyu.t086.com/gushi/619.html</span><br><span class="line">crawl-fetch-3___1___http://chengyu.t086.com/gushi/616.html</span><br><span class="line">crawl-fetch-7___1___http://chengyu.t086.com/gushi/615.html</span><br><span class="line">main___1___http://chengyu.t086.com/gushi/605.html</span><br><span class="line">crawl-fetch-10___1___http://chengyu.t086.com/gushi/613.html</span><br><span class="line">crawl-fetch-2___1___http://chengyu.t086.com/gushi/610.html</span><br><span class="line">crawl-fetch-5___1___http://chengyu.t086.com/gushi/609.html</span><br><span class="line">crawl-fetch-1___1___http://chengyu.t086.com/gushi/608.html</span><br><span class="line">crawl-fetch-6___1___http://chengyu.t086.com/gushi/606.html</span><br><span class="line">crawl-fetch-9___1___http://chengyu.t086.com/gushi/607.html</span><br><span class="line">crawl-fetch-8___1___http://chengyu.t086.com/gushi/603.html</span><br><span class="line">main___1___http://chengyu.t086.com/gushi/594.html</span><br><span class="line">crawl-fetch-4___1___http://chengyu.t086.com/gushi/604.html</span><br><span class="line">crawl-fetch-7___1___http://chengyu.t086.com/gushi/600.html</span><br><span 
class="line">crawl-fetch-10___1___http://chengyu.t086.com/gushi/602.html</span><br><span class="line">crawl-fetch-2___1___http://chengyu.t086.com/gushi/599.html</span><br><span class="line">crawl-fetch-3___1___http://chengyu.t086.com/gushi/601.html</span><br><span class="line">crawl-fetch-5___1___http://chengyu.t086.com/gushi/598.html</span><br><span class="line">crawl-fetch-6___1___http://chengyu.t086.com/gushi/596.html</span><br><span class="line">crawl-fetch-1___1___http://chengyu.t086.com/gushi/597.html</span><br><span class="line">crawl-fetch-4___1___http://chengyu.t086.com/gushi/593.html</span><br><span class="line">crawl-fetch-8___1___http://chengyu.t086.com/gushi/591.html</span><br><span class="line">crawl-fetch-9___1___http://chengyu.t086.com/gushi/595.html</span><br><span class="line">crawl-fetch-7___1___http://chengyu.t086.com/gushi/592.html</span><br><span class="line">main___2___http://chengyu.t086.com/gushi/583.html</span><br><span class="line">crawl-fetch-3___2___http://chengyu.t086.com/gushi/588.html</span><br><span class="line">crawl-fetch-10___1___http://chengyu.t086.com/gushi/590.html</span><br><span class="line">crawl-fetch-2___1___http://chengyu.t086.com/gushi/589.html</span><br><span class="line">crawl-fetch-5___2___http://chengyu.t086.com/gushi/579.html</span><br><span class="line">crawl-fetch-1___2___http://chengyu.t086.com/gushi/581.html</span><br><span class="line">crawl-fetch-7___2___http://chengyu.t086.com/gushi/584.html</span><br><span class="line">crawl-fetch-4___2___http://chengyu.t086.com/gushi/582.html</span><br><span class="line">crawl-fetch-3___2___http://chengyu.t086.com/gushi/587.html</span><br><span class="line">crawl-fetch-6___2___http://chengyu.t086.com/gushi/580.html</span><br><span class="line">crawl-fetch-9___2___http://chengyu.t086.com/gushi/585.html</span><br><span class="line">crawl-fetch-8___2___http://chengyu.t086.com/gushi/586.html</span><br><span 
class="line">crawl-fetch-10___2___http://chengyu.t086.com/gushi/578.html</span><br><span class="line">crawl-fetch-1___2___http://chengyu.t086.com/gushi/575.html</span><br><span class="line">crawl-fetch-2___2___http://chengyu.t086.com/gushi/577.html</span><br><span class="line">crawl-fetch-5___2___http://chengyu.t086.com/gushi/576.html</span><br><span class="line">crawl-fetch-7___2___http://chengyu.t086.com/gushi/574.html</span><br><span class="line">============ finish crawl! ======</span><br></pre></td></tr></table></figure>
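<h2 id="小结"><a href="#小结" class="headerlink" title="小结"></a>Summary</h2><blockquote>
<p>This post focuses on combining a crawl queue with a thread pool to run crawl jobs concurrently. It also implements a rather crude scheme for detecting when the crawl has finished.</p>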
</blockquote>
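<p>The queue-plus-thread-pool idea summarized above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the quick-crawler repository: the class <code>CrawlQueueSketch</code>, the in-memory <code>LINKS</code> map standing in for real pages, and the <code>pending</code> counter used to detect completion are all hypothetical, and the executor's internal work queue plays the role of the crawl queue. It needs Java 9 or later for <code>Map.of</code> and <code>List.of</code>.</p>

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

final class CrawlQueueSketch {
    // Hypothetical stand-in for the web: page -> links found on that page.
    static final Map<String, List<String>> LINKS = Map.of(
            "/index", List.of("/a", "/b"),
            "/a", List.of("/b", "/c"),
            "/b", List.of(),
            "/c", List.of("/index"));

    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final Set<String> visited = ConcurrentHashMap.newKeySet();
    // Jobs handed to the pool that have not finished yet.
    private final AtomicInteger pending = new AtomicInteger();
    private final Object lock = new Object();
    private final int maxDepth;

    private CrawlQueueSketch(int maxDepth) {
        this.maxDepth = maxDepth;
    }

    /** Crawls from seed down to maxDepth and returns every page visited. */
    static Set<String> crawl(String seed, int maxDepth) {
        CrawlQueueSketch c = new CrawlQueueSketch(maxDepth);
        try {
            c.submit(seed, 0);
            synchronized (c.lock) {
                // Block until the last running job reports completion.
                while (c.pending.get() > 0) {
                    c.lock.wait();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            c.pool.shutdown();
        }
        return new TreeSet<>(c.visited);
    }

    /** Each link becomes its own job, unless it is too deep or already seen. */
    private void submit(String url, int depth) {
        if (depth > maxDepth || !visited.add(url)) {
            return;
        }
        pending.incrementAndGet(); // counted before the job is queued
        pool.execute(() -> {
            try {
                for (String link : LINKS.getOrDefault(url, List.of())) {
                    submit(link, depth + 1);
                }
            } finally {
                // Children were counted above, so zero means nothing is left.
                if (pending.decrementAndGet() == 0) {
                    synchronized (lock) {
                        lock.notifyAll();
                    }
                }
            }
        });
    }

    public static void main(String[] args) {
        System.out.println(crawl("/index", 1));
    }
}
```

<p>Because <code>pending</code> is incremented before a job is handed to the pool and decremented only after that job has submitted its children, the counter can reach zero only when no work is left. That is one way to decide that the crawl has ended without guessing.</p>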
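<h3 id="缺陷"><a href="#缺陷" class="headerlink" title="缺陷"></a>Shortcomings</h3><p>The implementation above has one very obvious shortcoming: it logs far too little. The next post addresses this by logging the key points along the crawl path, and also takes care of the remaining pending optimizations.</p>
<p>At this point the skeleton of a crawler framework is essentially complete. Plenty of problems remain, of course: the queue depth is not handled, JobCountMap may blow up, and some basic precautions that any crawler should observe are not handled well. That is fine; these will be improved step by step in later posts.</p>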
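<p>One of the open problems listed above is the depth of the crawl queue. As one possible mitigation, assuming that dropping excess links is acceptable, the queue could be given a fixed capacity so that extra links are rejected instead of piling up in memory. This is only a sketch and not necessarily what the later posts do: the class <code>BoundedFetchQueueSketch</code> and its method names are hypothetical, while <code>ArrayBlockingQueue</code> is the standard JDK class.</p>

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

final class BoundedFetchQueueSketch {
    private final BlockingQueue<String> queue;
    // Number of URLs turned away because the queue was full.
    private final AtomicInteger rejected = new AtomicInteger();

    BoundedFetchQueueSketch(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Tries to enqueue a URL; returns false without blocking when the queue is full. */
    boolean offer(String url) {
        boolean accepted = queue.offer(url);
        if (!accepted) {
            rejected.incrementAndGet();
        }
        return accepted;
    }

    /** Takes the next URL, or returns null when the queue is empty. */
    String poll() {
        return queue.poll();
    }

    int rejectedCount() {
        return rejected.get();
    }

    public static void main(String[] args) {
        BoundedFetchQueueSketch q = new BoundedFetchQueueSketch(2);
        System.out.println(q.offer("/a"));
        System.out.println(q.offer("/b"));
        System.out.println(q.offer("/c")); // queue already holds two URLs
        System.out.println(q.rejectedCount());
    }
}
```

<p>Rejecting with <code>offer</code> keeps producers from blocking; a variant that calls <code>put</code> would instead make them wait for free space, which trades dropped links for back-pressure.</p>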
<h3 id="源码地址"><a href="#源码地址" class="headerlink" title="源码地址"></a>Source code</h3><p>Project: <a href="https://github.com/liuyueyi/quick-crawler" target="_blank" rel="noopener">https://github.com/liuyueyi/quick-crawler</a></p>
<p>Tag before the optimization: <a href="https://github.com/liuyueyi/quick-crawler/releases/tag/v0.004" target="_blank" rel="noopener">v0.004</a></p>
<p>Tag after the optimization: <a href="https://github.com/liuyueyi/quick-crawler/releases/tag/v0.005" target="_blank" rel="noopener">v0.005</a></p>
<h3 id="相关博文"><a href="#相关博文" class="headerlink" title="相关博文"></a>Related posts</h3><p><a href="https://liuyueyi.github.io/hexblog/categories/%E6%8A%80%E6%9C%AF/Quick%E7%B3%BB%E5%88%97%E9%A1%B9%E7%9B%AE/QuickCrawler/" target="_blank" rel="noopener">Quick-Crawler series posts</a></p>
<h2 id="II-其他"><a href="#II-其他" class="headerlink" title="II. 其他"></a>II. Other</h2><h3 id="一灰灰Blog：-https-liuyueyi-github-io-hexblog"><a href="#一灰灰Blog：-https-liuyueyi-github-io-hexblog" class="headerlink" title="一灰灰Blog： https://liuyueyi.github.io/hexblog"></a><a href="https://liuyueyi.github.io/hexblog" target="_blank" rel="noopener">一灰灰Blog</a>: <a href="https://liuyueyi.github.io/hexblog" target="_blank" rel="noopener">https://liuyueyi.github.io/hexblog</a></h3><p>一灰灰's personal blog, which collects all posts written during study and work. Feel free to drop by.</p>
<h3 id="声明"><a href="#声明" class="headerlink" title="声明"></a>Disclaimer</h3><p>As the saying goes, believing everything in books is worse than having no books at all. The content above is only one person's view. My ability is limited, so omissions and mistakes are inevitable; if you find a bug or have a better suggestion, corrections and criticism are welcome and much appreciated.</p>
<ul>
<li>Weibo: <a href="https://weibo.com/p/1005052169825577/home" target="_blank" rel="noopener">小灰灰Blog</a></li>
<li>QQ: 一灰灰/3302797840</li>
</ul>
<h3 id="扫描关注"><a href="#扫描关注" class="headerlink" title="扫描关注"></a>Scan to follow</h3><p><img src="https://raw.githubusercontent.com/liuyueyi/Source/master/img/info/blogInfoV2.png" alt="QrCode"></p>


            
        </div>
        
        <div class="level is-size-7 is-uppercase">
            <div class="level-start">
                <div class="level-item">
                    <span class="is-size-6 has-text-grey has-mr-7">#</span>
                    <a class="has-link-grey -link" href="/hexblog/tags/爬虫/">Crawler</a>
                </div>
            </div>
        </div>
        
        
        
        
    </div>
</div>






<div class="card card-transparent">
    <div class="level post-navigation is-flex-wrap is-mobile">
        
        <div class="level-start">
            <a class="level level-item has-link-grey  article-nav-prev" href="/hexblog/2017/07/27/Java-动手写爬虫-四、日志埋点输出-动态配置支持/">
                <i class="level-item fas fa-chevron-left"></i>
                <span class="level-item">Writing a Crawler in Java, Part 4: Log Instrumentation &amp; Dynamic Configuration Support</span>
            </a>
        </div>
        
        
        <div class="level-end">
            <a class="level level-item has-link-grey  article-nav-next" href="/hexblog/2017/06/30/Java-动手写爬虫-二、-深度爬取/">
                <span class="level-item">Writing a Crawler in Java, Part 2: Deep Crawling</span>
                <i class="level-item fas fa-chevron-right"></i>
            </a>
        </div>
        
    </div>
</div>



</div>
                




<div class="column is-4-tablet is-4-desktop is-3-widescreen  has-order-1 column-left ">
    
        
<div class="card widget">
    <div class="card-content card-info">
        <nav class="level">
            <div class="level-item has-text-centered">
                <div>
                    
                        <img class="image is-128x128 has-mb-6" src="/hexblog/images/avatar.jpg" alt="一灰灰Blog">
                    
                    
                    <p class="is-size-4 is-block">
                        一灰灰Blog
                    </p>
                    
                    
                    <p class="is-size-6 is-block">
                        Java, server-side backend development
                    </p>
                    
                    
                    <p class="is-size-6 is-flex is-flex-center has-text-grey">
                        <i class="fas fa-map-marker-alt has-mr-7"></i>
                        <span>Wuhan, China</span>
                    </p>
                    
                </div>
            </div>
        </nav>
        <nav class="level is-mobile">
            <div class="level-item has-text-centered is-marginless">
                <div>
                    
                    <p class="heading">
                        Posts
                    </p>
                    <a href="/hexblog/archives">
                    <p class="title has-text-weight-normal">
                        269
                    </p>
                    </a>
                </div>
            </div>
            <div class="level-item has-text-centered is-marginless">
                <div>
                    
                    <p class="heading">
                        Categories
                    </p>
                    <a href="/hexblog/categories">
                        <p class="title has-text-weight-normal">
                            70
                        </p>
                    </a>
                </div>
            </div>
            <div class="level-item has-text-centered is-marginless">
                <div>
                
                    <p class="heading">
                        Tags
                    </p>
                    <a href="/hexblog/tags">
                    <p class="title has-text-weight-normal">
                        74
                    </p>
                    </a>
                </div>
            </div>
        </nav>
        <div class="level">
            <a class="level-item button is-link is-rounded" href="https://github.com/liuyueyi" target="_blank">
                Follow me</a>
        </div>
        
        
        <div class="level is-mobile">
            
            <a class="level-item button is-white is-marginless" target="_blank"
                title="Github" href="https://github.com/liuyueyi">
                
                <i class="fab fa-github"></i>
                
            </a>
            
            <a class="level-item button is-white is-marginless" target="_blank"
                title="Gitee" href="https://gitee.com/liuyueyi">
                
                <i class="fab fa-gg"></i>
                
            </a>
            
            <a class="level-item button is-white is-marginless" target="_blank"
                title="Weibo" href="https://weibo.com/p/1005052169825577/home">
                
                <i class="fab fa-weibo"></i>
                
            </a>
            
            <a class="level-item button is-white is-marginless" target="_blank"
                title="weixin" href="https://s10.mogucdn.com/mlcdn/c45406/171229_1cgld3igbelkbc70cd8af1j3809kb_150x150.jpg">
                
                <i class="fab fa-weixin"></i>
                
            </a>
            
            <a class="level-item button is-white is-marginless" target="_blank"
                title="RSS" href="/hexblog/atom.xml">
                
                <i class="fas fa-rss"></i>
                
            </a>
            
        </div>
        
    </div>
</div>
    
        
    
        



    
        

    
        

    
        <div class="card widget">
    <div class="card-content">
        <div class="menu">
            <h3 class="menu-label">
                Tags
            </h3>
            <div class="field is-grouped is-grouped-multiline">
                
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/Python/">
                        <span class="tag">Python</span>
                        <span class="tag is-grey">32</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/QuickAlarm/">
                        <span class="tag">QuickAlarm</span>
                        <span class="tag is-grey">1</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/RabbitMQ/">
                        <span class="tag">RabbitMQ</span>
                        <span class="tag is-grey">8</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/ReactJS/">
                        <span class="tag">ReactJS</span>
                        <span class="tag is-grey">1</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/Redis/">
                        <span class="tag">Redis</span>
                        <span class="tag is-grey">9</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/Shell/">
                        <span class="tag">Shell</span>
                        <span class="tag is-grey">13</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/Socket/">
                        <span class="tag">Socket</span>
                        <span class="tag is-grey">1</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/Solr/">
                        <span class="tag">Solr</span>
                        <span class="tag is-grey">1</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/Spring/">
                        <span class="tag">Spring</span>
                        <span class="tag is-grey">25</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/SpringBoot/">
                        <span class="tag">SpringBoot</span>
                        <span class="tag is-grey">1</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/Vue/">
                        <span class="tag">Vue</span>
                        <span class="tag is-grey">1</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/WebSocket/">
                        <span class="tag">WebSocket</span>
                        <span class="tag is-grey">1</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/Yaml/">
                        <span class="tag">Yaml</span>
                        <span class="tag is-grey">1</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/css/">
                        <span class="tag">css</span>
                        <span class="tag is-grey">3</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/ffmpeg/">
                        <span class="tag">ffmpeg</span>
                        <span class="tag is-grey">1</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/gitalk/">
                        <span class="tag">gitalk</span>
                        <span class="tag is-grey">1</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/hexo/">
                        <span class="tag">hexo</span>
                        <span class="tag is-grey">1</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/jdk/">
                        <span class="tag">jdk</span>
                        <span class="tag is-grey">1</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/logger/">
                        <span class="tag">logger</span>
                        <span class="tag is-grey">1</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/markdown/">
                        <span class="tag">markdown</span>
                        <span class="tag is-grey">1</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/python/">
                        <span class="tag">python</span>
                        <span class="tag is-grey">1</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/time/">
                        <span class="tag">time</span>
                        <span class="tag is-grey">1</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/乱码/">
                        <span class="tag">乱码</span>
                        <span class="tag is-grey">1</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/二维码/">
                        <span class="tag">二维码</span>
                        <span class="tag is-grey">1</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/分库分表/">
                        <span class="tag">分库分表</span>
                        <span class="tag is-grey">1</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/反射/">
                        <span class="tag">反射</span>
                        <span class="tag is-grey">3</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/工具/">
                        <span class="tag">工具</span>
                        <span class="tag is-grey">5</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/并发/">
                        <span class="tag">并发</span>
                        <span class="tag is-grey">3</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/序列化/">
                        <span class="tag">序列化</span>
                        <span class="tag is-grey">1</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/手记/">
                        <span class="tag">手记</span>
                        <span class="tag is-grey">6</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/技术方案/">
                        <span class="tag">技术方案</span>
                        <span class="tag is-grey">22</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/指南/">
                        <span class="tag">指南</span>
                        <span class="tag is-grey">6</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/教程/">
                        <span class="tag">教程</span>
                        <span class="tag is-grey">20</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/方案设计/">
                        <span class="tag">方案设计</span>
                        <span class="tag is-grey">1</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/时区/">
                        <span class="tag">时区</span>
                        <span class="tag is-grey">1</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/时间窗口/">
                        <span class="tag">时间窗口</span>
                        <span class="tag is-grey">3</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/爬虫/">
                        <span class="tag">爬虫</span>
                        <span class="tag is-grey">5</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/问题记录/">
                        <span class="tag">问题记录</span>
                        <span class="tag is-grey">1</span>
                    </a>
                </div>
                
                <div class="control">
                    <a class="tags has-addons" href="/hexblog/tags/随笔/">
                        <span class="tag">随笔</span>
                        <span class="tag is-grey">1</span>
                    </a>
                </div>
                
            </div>
        </div>
    </div>
</div>
    
    
        <div class="column-right-shadow is-hidden-widescreen ">
        
            
<div class="card widget">
    <div class="card-content">
        <div class="menu">
            <h3 class="menu-label">
                分类
            </h3>
            <ul class="menu-list">
            <li>
        <a class="level is-marginless" href="/hexblog/categories/DB/">
            <span class="level-start">
                <span class="level-item">DB</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">41</span>
            </span>
        </a><ul><li>
        <a class="level is-marginless" href="/hexblog/categories/DB/InfluxDB/">
            <span class="level-start">
                <span class="level-item">InfluxDB</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">16</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/DB/Mongo/">
            <span class="level-start">
                <span class="level-item">Mongo</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">11</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/DB/Mysql/">
            <span class="level-start">
                <span class="level-item">Mysql</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">13</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/DB/分库分表/">
            <span class="level-start">
                <span class="level-item">分库分表</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li></ul></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Java/">
            <span class="level-start">
                <span class="level-item">Java</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">52</span>
            </span>
        </a><ul><li>
        <a class="level is-marginless" href="/hexblog/categories/Java/Agent/">
            <span class="level-start">
                <span class="level-item">Agent</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">2</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Java/Android/">
            <span class="level-start">
                <span class="level-item">Android</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Java/Bugfix/">
            <span class="level-start">
                <span class="level-item">Bugfix</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">3</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Java/IO/">
            <span class="level-start">
                <span class="level-item">IO</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">3</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Java/JDK/">
            <span class="level-start">
                <span class="level-item">JDK</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">22</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Java/JVM/">
            <span class="level-start">
                <span class="level-item">JVM</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">9</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Java/JavaWeb/">
            <span class="level-start">
                <span class="level-item">JavaWeb</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">2</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Java/其他/">
            <span class="level-start">
                <span class="level-item">其他</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">4</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Java/问题记录/">
            <span class="level-start">
                <span class="level-item">问题记录</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">6</span>
            </span>
        </a></li></ul></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Python/">
            <span class="level-start">
                <span class="level-item">Python</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">33</span>
            </span>
        </a><ul><li>
        <a class="level is-marginless" href="/hexblog/categories/Python/Mongo/">
            <span class="level-start">
                <span class="level-item">Mongo</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Python/MySql/">
            <span class="level-start">
                <span class="level-item">MySql</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">2</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Python/教程/">
            <span class="level-start">
                <span class="level-item">教程</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">25</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Python/采坑记录/">
            <span class="level-start">
                <span class="level-item">采坑记录</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">2</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Python/项目/">
            <span class="level-start">
                <span class="level-item">项目</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li></ul></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Quick系列/">
            <span class="level-start">
                <span class="level-item">Quick系列</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">30</span>
            </span>
        </a><ul><li>
        <a class="level is-marginless" href="/hexblog/categories/Quick系列/QuickAlarm/">
            <span class="level-start">
                <span class="level-item">QuickAlarm</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">8</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Quick系列/QuickCrawler/">
            <span class="level-start">
                <span class="level-item">QuickCrawler</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">5</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Quick系列/QuickFix/">
            <span class="level-start">
                <span class="level-item">QuickFix</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">6</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Quick系列/QuickMedia/">
            <span class="level-start">
                <span class="level-item">QuickMedia</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">2</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Quick系列/QuickSpi/">
            <span class="level-start">
                <span class="level-item">QuickSpi</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">4</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Quick系列/QuickTask/">
            <span class="level-start">
                <span class="level-item">QuickTask</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">5</span>
            </span>
        </a></li></ul></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Shell/">
            <span class="level-start">
                <span class="level-item">Shell</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">39</span>
            </span>
        </a><ul><li>
        <a class="level is-marginless" href="/hexblog/categories/Shell/CMD/">
            <span class="level-start">
                <span class="level-item">CMD</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">18</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Shell/Docker/">
            <span class="level-start">
                <span class="level-item">Docker</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">4</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Shell/Git/">
            <span class="level-start">
                <span class="level-item">Git</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">3</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Shell/Maven/">
            <span class="level-start">
                <span class="level-item">Maven</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Shell/Ngins/">
            <span class="level-start">
                <span class="level-item">Ngins</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Shell/Nginx/">
            <span class="level-start">
                <span class="level-item">Nginx</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">2</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Shell/环境搭建/">
            <span class="level-start">
                <span class="level-item">环境搭建</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">9</span>
            </span>
        </a></li></ul></li><li>
        <a class="level is-marginless" href="/hexblog/categories/前端/">
            <span class="level-start">
                <span class="level-item">前端</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">8</span>
            </span>
        </a><ul><li>
        <a class="level is-marginless" href="/hexblog/categories/前端/Chrome/">
            <span class="level-start">
                <span class="level-item">Chrome</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/前端/Css/">
            <span class="level-start">
                <span class="level-item">Css</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">3</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/前端/Jquery/">
            <span class="level-start">
                <span class="level-item">Jquery</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/前端/ReactJS/">
            <span class="level-start">
                <span class="level-item">ReactJS</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">2</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/前端/Vue/">
            <span class="level-start">
                <span class="level-item">Vue</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li></ul></li><li>
        <a class="level is-marginless" href="/hexblog/categories/工具/">
            <span class="level-start">
                <span class="level-item">工具</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">9</span>
            </span>
        </a><ul><li>
        <a class="level is-marginless" href="/hexblog/categories/工具/工具类/">
            <span class="level-start">
                <span class="level-item">工具类</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">6</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/工具/插件系列/">
            <span class="level-start">
                <span class="level-item">插件系列</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">3</span>
            </span>
        </a></li></ul></li><li>
        <a class="level is-marginless" href="/hexblog/categories/开源/">
            <span class="level-start">
                <span class="level-item">开源</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">46</span>
            </span>
        </a><ul><li>
        <a class="level is-marginless" href="/hexblog/categories/开源/Guava/">
            <span class="level-start">
                <span class="level-item">Guava</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">2</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/开源/Hexo/">
            <span class="level-start">
                <span class="level-item">Hexo</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/开源/Hystrix/">
            <span class="level-start">
                <span class="level-item">Hystrix</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/开源/Mybatis/">
            <span class="level-start">
                <span class="level-item">Mybatis</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">2</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/开源/OGNL/">
            <span class="level-start">
                <span class="level-item">OGNL</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">3</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/开源/RabbitMQ/">
            <span class="level-start">
                <span class="level-item">RabbitMQ</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">8</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/开源/Redis/">
            <span class="level-start">
                <span class="level-item">Redis</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">3</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/开源/Solr/">
            <span class="level-start">
                <span class="level-item">Solr</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/开源/Spring/">
            <span class="level-start">
                <span class="level-item">Spring</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">24</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/开源/Yaml/">
            <span class="level-start">
                <span class="level-item">Yaml</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li></ul></li><li>
        <a class="level is-marginless" href="/hexblog/categories/火花/">
            <span class="level-start">
                <span class="level-item">Sparks</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">4</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/随笔/">
            <span class="level-start">
                <span class="level-item">Essays</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">7</span>
            </span>
        </a><ul><li>
        <a class="level is-marginless" href="/hexblog/categories/随笔/idea/">
            <span class="level-start">
                <span class="level-item">idea</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/随笔/吐槽/">
            <span class="level-start">
                <span class="level-item">Rants</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li></ul></li>
            </ul>
        </div>
    </div>
</div>
        
        
        </div>
    
</div>

                




<div class="column is-4-tablet is-4-desktop is-2-widescreen is-hidden-touch is-hidden-desktop-only has-order-3 column-right ">
    
        
<div class="card widget">
    <div class="card-content">
        <div class="menu">
            <h3 class="menu-label">
                分类
            </h3>
            <ul class="menu-list">
            <li>
        <a class="level is-marginless" href="/hexblog/categories/DB/">
            <span class="level-start">
                <span class="level-item">DB</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">41</span>
            </span>
        </a><ul><li>
        <a class="level is-marginless" href="/hexblog/categories/DB/InfluxDB/">
            <span class="level-start">
                <span class="level-item">InfluxDB</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">16</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/DB/Mongo/">
            <span class="level-start">
                <span class="level-item">Mongo</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">11</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/DB/Mysql/">
            <span class="level-start">
                <span class="level-item">Mysql</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">13</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/DB/分库分表/">
            <span class="level-start">
                <span class="level-item">分库分表</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li></ul></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Java/">
            <span class="level-start">
                <span class="level-item">Java</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">52</span>
            </span>
        </a><ul><li>
        <a class="level is-marginless" href="/hexblog/categories/Java/Agent/">
            <span class="level-start">
                <span class="level-item">Agent</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">2</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Java/Android/">
            <span class="level-start">
                <span class="level-item">Android</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Java/Bugfix/">
            <span class="level-start">
                <span class="level-item">Bugfix</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">3</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Java/IO/">
            <span class="level-start">
                <span class="level-item">IO</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">3</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Java/JDK/">
            <span class="level-start">
                <span class="level-item">JDK</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">22</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Java/JVM/">
            <span class="level-start">
                <span class="level-item">JVM</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">9</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Java/JavaWeb/">
            <span class="level-start">
                <span class="level-item">JavaWeb</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">2</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Java/其他/">
            <span class="level-start">
                <span class="level-item">其他</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">4</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Java/问题记录/">
            <span class="level-start">
                <span class="level-item">问题记录</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">6</span>
            </span>
        </a></li></ul></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Python/">
            <span class="level-start">
                <span class="level-item">Python</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">33</span>
            </span>
        </a><ul><li>
        <a class="level is-marginless" href="/hexblog/categories/Python/Mongo/">
            <span class="level-start">
                <span class="level-item">Mongo</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Python/MySql/">
            <span class="level-start">
                <span class="level-item">MySql</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">2</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Python/教程/">
            <span class="level-start">
                <span class="level-item">教程</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">25</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Python/采坑记录/">
            <span class="level-start">
                <span class="level-item">采坑记录</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">2</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Python/项目/">
            <span class="level-start">
                <span class="level-item">项目</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li></ul></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Quick系列/">
            <span class="level-start">
                <span class="level-item">Quick系列</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">30</span>
            </span>
        </a><ul><li>
        <a class="level is-marginless" href="/hexblog/categories/Quick系列/QuickAlarm/">
            <span class="level-start">
                <span class="level-item">QuickAlarm</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">8</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Quick系列/QuickCrawler/">
            <span class="level-start">
                <span class="level-item">QuickCrawler</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">5</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Quick系列/QuickFix/">
            <span class="level-start">
                <span class="level-item">QuickFix</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">6</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Quick系列/QuickMedia/">
            <span class="level-start">
                <span class="level-item">QuickMedia</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">2</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Quick系列/QuickSpi/">
            <span class="level-start">
                <span class="level-item">QuickSpi</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">4</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Quick系列/QuickTask/">
            <span class="level-start">
                <span class="level-item">QuickTask</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">5</span>
            </span>
        </a></li></ul></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Shell/">
            <span class="level-start">
                <span class="level-item">Shell</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">39</span>
            </span>
        </a><ul><li>
        <a class="level is-marginless" href="/hexblog/categories/Shell/CMD/">
            <span class="level-start">
                <span class="level-item">CMD</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">18</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Shell/Docker/">
            <span class="level-start">
                <span class="level-item">Docker</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">4</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Shell/Git/">
            <span class="level-start">
                <span class="level-item">Git</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">3</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Shell/Maven/">
            <span class="level-start">
                <span class="level-item">Maven</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Shell/Ngins/">
            <span class="level-start">
                <span class="level-item">Ngins</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Shell/Nginx/">
            <span class="level-start">
                <span class="level-item">Nginx</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">2</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/Shell/环境搭建/">
            <span class="level-start">
                <span class="level-item">环境搭建</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">9</span>
            </span>
        </a></li></ul></li><li>
        <a class="level is-marginless" href="/hexblog/categories/前端/">
            <span class="level-start">
                <span class="level-item">前端</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">8</span>
            </span>
        </a><ul><li>
        <a class="level is-marginless" href="/hexblog/categories/前端/Chrome/">
            <span class="level-start">
                <span class="level-item">Chrome</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/前端/Css/">
            <span class="level-start">
                <span class="level-item">Css</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">3</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/前端/Jquery/">
            <span class="level-start">
                <span class="level-item">Jquery</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/前端/ReactJS/">
            <span class="level-start">
                <span class="level-item">ReactJS</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">2</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/前端/Vue/">
            <span class="level-start">
                <span class="level-item">Vue</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li></ul></li><li>
        <a class="level is-marginless" href="/hexblog/categories/工具/">
            <span class="level-start">
                <span class="level-item">工具</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">9</span>
            </span>
        </a><ul><li>
        <a class="level is-marginless" href="/hexblog/categories/工具/工具类/">
            <span class="level-start">
                <span class="level-item">工具类</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">6</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/工具/插件系列/">
            <span class="level-start">
                <span class="level-item">插件系列</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">3</span>
            </span>
        </a></li></ul></li><li>
        <a class="level is-marginless" href="/hexblog/categories/开源/">
            <span class="level-start">
                <span class="level-item">开源</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">46</span>
            </span>
        </a><ul><li>
        <a class="level is-marginless" href="/hexblog/categories/开源/Guava/">
            <span class="level-start">
                <span class="level-item">Guava</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">2</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/开源/Hexo/">
            <span class="level-start">
                <span class="level-item">Hexo</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/开源/Hystrix/">
            <span class="level-start">
                <span class="level-item">Hystrix</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/开源/Mybatis/">
            <span class="level-start">
                <span class="level-item">Mybatis</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">2</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/开源/OGNL/">
            <span class="level-start">
                <span class="level-item">OGNL</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">3</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/开源/RabbitMQ/">
            <span class="level-start">
                <span class="level-item">RabbitMQ</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">8</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/开源/Redis/">
            <span class="level-start">
                <span class="level-item">Redis</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">3</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/开源/Solr/">
            <span class="level-start">
                <span class="level-item">Solr</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/开源/Spring/">
            <span class="level-start">
                <span class="level-item">Spring</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">24</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/开源/Yaml/">
            <span class="level-start">
                <span class="level-item">Yaml</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li></ul></li><li>
        <a class="level is-marginless" href="/hexblog/categories/火花/">
            <span class="level-start">
                <span class="level-item">Sparks</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">4</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/随笔/">
            <span class="level-start">
                <span class="level-item">Essays</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">7</span>
            </span>
        </a><ul><li>
        <a class="level is-marginless" href="/hexblog/categories/随笔/idea/">
            <span class="level-start">
                <span class="level-item">idea</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li><li>
        <a class="level is-marginless" href="/hexblog/categories/随笔/吐槽/">
            <span class="level-start">
                <span class="level-item">Rants</span>
            </span>
            <span class="level-end">
                <span class="level-item tag">1</span>
            </span>
        </a></li></ul></li>
            </ul>
        </div>
    </div>
</div>
    
    
    
</div>

            </div>
        </div>
    </section>
    <footer class="footer">
    <div class="container">
        <div class="level">
            <div class="level-start has-text-centered-mobile">
                <a class="footer-logo is-block has-mb-6" href="/hexblog/">
                
                    <img src="/hexblog/images/avatar.jpg" alt="Java 动手写爬虫: 三、爬取队列" height="28">
                
                </a>
                <p class="is-size-7">
                &copy; 2020 YiHui&nbsp;
                Powered by <a href="https://hexo.io/" target="_blank">Hexo</a> &amp; <a
                        href="https://github.com/ppoffice/hexo-theme-icarus" target="_blank">Icarus</a>
                
                </p>
            </div>
            <div class="level-end">
            
                <div class="field has-addons is-flex-center-mobile has-mt-5-mobile is-flex-wrap is-flex-middle">
                
                
                <p class="control">
                    <a class="button is-white is-large" target="_blank" title="Download on GitHub" href="https://github.com/liuyueyi">
                        
                        <i class="fab fa-github"></i>
                        
                    </a>
                </p>
                
                <p class="control">
                    <a class="button is-white is-large" target="_blank" title="Chat on Weibo" href="https://weibo.com/p/1005052169825577/home">
                        
                        <i class="fab fa-weibo"></i>
                        
                    </a>
                </p>
                
                <p class="control">
                    <a class="button is-white is-large" target="_blank" title="Friends with me" href="https://s10.mogucdn.com/mlcdn/c45406/171229_1cgld3igbelkbc70cd8af1j3809kb_150x150.jpg">
                        
                        <i class="fab fa-weixin"></i>
                        
                    </a>
                </p>
                
                </div>
            
            </div>
        </div>
    </div>
    <hr/>
    <div id="foot-pannel">
        <div class="outer">
            <div class="foot-column">
                <!-- Follow us -->
                <div class="widget-wrap widget-list">
                    <h3 class="widget-title">More platforms</h3>
                    <div class="widget">
                        <ul>
                            <li>
                                <a href="//blog.hhui.top/">一灰灰Blog</a>
                            </li>
                        
                            <li>
                                <a href="//juejin.im/user/5a2a4b095188252ae93adbbf/posts">掘金</a>
                            </li>
                        
                            <li>
                                <a href="//my.oschina.net/u/566591">开源中国</a>
                            </li>
                        
                            <li>
                                <a href="//blog.csdn.net/liuyueyi25">CSDN</a>
                            </li>
                        
                            <li>
                                <a href="//www.jianshu.com/u/5902ab08e670">简书</a>
                            </li>
                        
                            <li>
                                <a href="//cloud.tencent.com/developer/column/1847">云+</a>
                            </li>
                        
                            <li>
                                <a href="//www.toutiao.com/c/user/69862071663/#mid=1579653107239950">头条</a>
                            </li>
                        
                            <li>
                                <a href="//github.com/liuyueyi">GitHub</a>
                            </li>
                        
                            <li>
                                <a href="//gitee.com/liuyueyi">Gitee</a>
                            </li>
                            
                        </ul>
                    </div>
                </div>
            </div>
            <div class="foot-column">
                <!-- Contact -->
                <div class="widget-wrap widget-list">
                    <h3 class="widget-title">一灰灰Blog</h3>
                    <div class="widget">
                        <ul>
                            
                                <li>
                                    <a href="#">QQ : 3302797840</a>
                                </li>
                            
                                <li>
                                    <a href="#">WeChat: liuyueyi25</a>
                                </li>
                            
                                <li>
                                    <a href="#">Email: bangzewu@126.com</a>
                                </li>
                            
                                <li>
                                    <a href="#">Weibo: 一灰灰blog</a>
                                </li>
                            
                        </ul>
                    </div>
                </div>
                <div class="widget-wrap widget-list">
                    <h3 class="widget-title">Friendly links</h3>
                    <div class="widget">
                        <ul>
                            
                                <span class="label">
                                    <a target="_blank" href="//zweb.hhui.top">zweb multimedia tools page</a>
                                </span>
                            
                                <span class="label">
                                    <a target="_blank" href="//mweb.hhui.top">mweb classical poetry selection</a>
                                </span>
                            
                        </ul>
                    </div>
                </div>
            </div>
            <div class="foot-column">
                <!-- Knowledge Planet -->
                <div class="widget-wrap widget-list">
                    <h3 class="widget-title">Knowledge Planet</h3>
                    <div class="widget">
                        <img style="width: 200px;" src="/hexblog/imgs/info/xingqiu.png" alt="Knowledge Planet QR code">
                    </div>
                </div>
                
            </div>
            <div class="foot-column">
                <div class="widget-wrap widget-list">
                    <h3 class="widget-title">WeChat official account</h3>
                    <div class="widget">
                        <img style="width: 200px;" src="/hexblog/imgs/info/wx.jpg" alt="WeChat official account QR code">
                    </div>
                </div>
            </div>
        </div>
    </div>
    <div style="padding-top:20em">
        <hr/>
        <div id="foot-info">
            <span style="margin:4px;font-size:1em">
                &copy; 2020 <a target='_blank' href='https://github.com/liuyueyi'>一灰灰Blog</a> All rights reserved | <a href="http://www.beian.miit.gov.cn" target="_blank">鄂ICP备18017282号</a>
            </span>
            <br/>
            <span style="margin:4px;font-size:1em">
                <script type="text/javascript">document.write(unescape("%3Cspan id='cnzz_stat_icon_1278691600'%3E%3C/span%3E%3Cscript src='https://s9.cnzz.com/z_stat.php%3Fid%3D1278691600%26online%3D1%26show%3Dline' type='text/javascript'%3E%3C/script%3E"));</script>
            </span>
            <span style="margin:1em;font-size:1.4em">
                <label id="self_count_cnt"><br>Total site visits: <span class="visit_cnt">69330</span> | Total visitors: <span class="visit_cnt">11586</span> | Congratulations, you are visitor number <span class="visit_cnt">10840</span></label>
            </span>
            <script src="/hexblog/js/count.js"></script>
        </div>
    </div>
</footer>
    <script src="https://cdn.jsdelivr.net/npm/jquery@3.3.1/dist/jquery.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/moment@2.22.2/min/moment-with-locales.min.js"></script>
<script>moment.locale("zh-CN");</script>


    
    
    
    <script src="/hexblog/js/animation.js"></script>
    

    
    
    
    <script src="https://cdn.jsdelivr.net/npm/lightgallery@1.6.8/dist/js/lightgallery.min.js" defer></script>
    <script src="https://cdn.jsdelivr.net/npm/justifiedGallery@3.7.0/dist/js/jquery.justifiedGallery.min.js" defer></script>
    <script src="/hexblog/js/gallery.js" defer></script>
    

    
    

<div id="outdated">
    <h6>Your browser is out-of-date!</h6>
    <p>Update your browser to view this website correctly. <a id="btnUpdateBrowser" href="http://outdatedbrowser.com/">Update
            my browser now </a></p>
    <p class="last"><a href="#" id="btnCloseUpdateBrowser" title="Close">&times;</a></p>
</div>
<script src="https://cdn.jsdelivr.net/npm/outdatedbrowser@1.1.5/outdatedbrowser/outdatedbrowser.min.js" defer></script>
<script>
    document.addEventListener("DOMContentLoaded", function () {
        outdatedBrowser({
            bgColor: '#f25648',
            color: '#ffffff',
            lowerThan: 'flex'
        });
    });
</script>


    
    
<script src="https://cdn.jsdelivr.net/npm/mathjax@2.7.5/unpacked/MathJax.js?config=TeX-MML-AM_CHTML" defer></script>
<script>
document.addEventListener('DOMContentLoaded', function () {
    MathJax.Hub.Config({
        'HTML-CSS': {
            matchFontHeight: false
        },
        SVG: {
            matchFontHeight: false
        },
        CommonHTML: {
            matchFontHeight: false
        },
        tex2jax: {
            inlineMath: [
                ['$','$'],
                ['\\(','\\)']
            ]
        }
    });
});
</script>

    
    

<a id="back-to-top" title="Back to top" href="javascript:;">
    <i class="fas fa-chevron-up"></i>
</a>
<script src="/hexblog/js/back-to-top.js" defer></script>


    
    

    
    
    
    

    
    
    
    
    
    <script src="https://cdn.jsdelivr.net/npm/clipboard@2.0.4/dist/clipboard.min.js" defer></script>
    <script src="/hexblog/js/clipboard.js" defer></script>
    

    
    
    


<script src="/hexblog/js/main.js" defer></script>

    
    <div class="searchbox ins-search">
    <div class="searchbox-container ins-search-container">
        <div class="searchbox-input-wrapper">
            <input type="text" class="searchbox-input ins-search-input" placeholder="What are you looking for..." />
            <span class="searchbox-close ins-close ins-selectable"><i class="fa fa-times-circle"></i></span>
        </div>
        <div class="searchbox-result-wrapper ins-section-wrapper">
            <div class="ins-section-container"></div>
        </div>
    </div>
</div>
<script>
    (function (window) {
        var INSIGHT_CONFIG = {
            TRANSLATION: {
                POSTS: 'Posts',
                PAGES: 'Pages',
                CATEGORIES: 'Categories',
                TAGS: 'Tags',
                UNTITLED: '(Untitled)',
            },
            CONTENT_URL: '/hexblog/content.json',
        };
        window.INSIGHT_CONFIG = INSIGHT_CONFIG;
    })(window);
</script>
<script src="/hexblog/js/insight.js" defer></script>
<link rel="stylesheet" href="/hexblog/css/search.css">
<link rel="stylesheet" href="/hexblog/css/insight.css">
    
</body>
</html>